ABSTRACT 

This report on language testing research focuses on 
lessons applied to limited English proficient (LEP) students and 
programs. First, a review of the history of primary and non-primary 
language testing is provided. The question of why there is no field 
of primary language testing is then discussed. The second major 
section of the report is a review of the broader literature of 
educational measurement as it relates to the critical role of 
language proficiency. The third section offers an idea of the place 
of language proficiency in a broader theory of human intelligence and 
representational capacities. Building on findings in non-primary 
language research, a possible resolution of the apparent controversy 
over the old notion of a single unifying general intelligence and 
distinct multiple intelligences is proposed. To conclude, a few 
observations about how to go about testing the increasing number of 
LEP students in schools are presented. Deep rather than surface 
assessment through discourse-based, real-life performances 
is recommended. Two responses to the paper, one by Fred Davidson and 
the other by Myriam Met, are appended. (VWL) 
...man is not just a creature of accident, chained to and formed by 
the particular cave in which he is born.... No real teacher can 
doubt that his task is to assist his pupil to fulfill human nature 
against all the deforming forces of convention and prejudice... 
Moreover there is no real teacher who in practice does not be- 
lieve in the existence of the soul, or in a magic that acts on it 
through speech (Allan Bloom, 1987, The closing of the American 
mind: How higher education has failed democracy and impover- 
ished the souls of today's students, p. 20). 

For educators at large, probably the first and most important les- 
son learned from language testing research is that language profi- 
ciency (whether it is construed as a general factor or as a constella- 
tion of related abilities) is important in one way or another to nearly 
everything that takes place in education - whether at school or else- 
where. Language proficiency is a critical element in the process of 
becoming literate and all of the other public manifestations of human 
intelligence that enable us to be the social beings that we are. It is 
important to intrapersonal and interpersonal performances of all 
sorts. Language, perhaps more than any other aspect of our exist- 
ence, is what enables us to be members of a community that includes 
people other than ourselves. Perhaps I can be forgiven, as someone 
who comes partly from a foreign language teaching background, for 
stressing as enthusiastically as I do that proficiency in another lan- 
guage is like a key that opens a door to new worlds of understanding 
and provides access to new communities. However, if we remain in a 
permanent state of monolingual myopia -- which in its most perni- 
cious form is a terminal disease - language can be a wall that sepa- 
rates us from all the world beyond our particular primary language 
community. To the terminally monolingual, the wall is invisible, in- 
tangible, and seemingly non-existent. Yet it is as impenetrable as 
solid granite and forms a prison more secure than concrete and steel 
ever could. Electronic surveillance in the prison is altogether unnec- 
essary because the inmates are as unaware of their situation as 
Plato's inhabitants of the cave were of theirs. 

The good news, of course, is that by acquiring a language or two 
beyond our primary linguistic system, we can become more aware of 
our limitations, prejudices, and the inevitable ignorance that plagues 
all the denizens of all the caves, and to some extent, we can, it seems, 
escape the special prison of monolingual prejudice. With this desir- 
able aim in mind, the insight that I want to develop - that language 
proficiency is central to all aspects of education - if it can be called 
an insight, will be news to no one in the bilingual education arena. 
Nor is it apt to make headlines with teachers who work with stu- 
dents of limited English proficiency (LEPs). Still, it is an insight that 
bears scrutiny and certainly criticism, and it epitomizes, I believe, 
what language testing research has to offer to a conference on evalu- 
ation and measurement issues relative to LEP students and the pro- 
grams that aim to serve them. With respect to the evaluation of pro- 
grams, a special sort of assessment problem, I concur with Prestine 
(1990) where she cites Rist (1982) who notes that program evaluation 
inevitably entails a general question that "is at once disarmingly 
simple and incredibly complex" - namely, "What's going on here?" 
(Rist, 1982, p. 440, and Prestine, 1990, p. 288). I'll try to show that 
language proficiency is a critical element in answering this general 
question not only in relation to individual students but also with re- 
spect to program evaluation. 

For the particular group of educators assembled at such a confer- 
ence as this one, I doubt it will be necessary to sell the idea that lan- 
guage proficiency matters. This is something that I assume we all 
agree on from the start. We may differ, however, in subtle and unan- 
ticipated ways on just how language proficiency matters and to what 
degree it matters. What I will attempt to do, therefore, is to elaborate 
on the ways in which language proficiency seems to matter according 
to the evidences afforded by theory and research. My analysis will be 
based on a selective review of the relevant literature. Underlying all 
of the discussion will be the ultimate aim of reaching some practical 
conclusions concerning how we ought to go about testing and evalua- 
tion in educational programs for LEP students. The best I can hope 
for is to affirm some of the good things that are already happening, 
to offer some constructive (I hope) criticisms concerning theories and 
practices that need mending, and to encourage us to capitalize still 
more on the rich linguistic resources that are coming to us in ever 
greater quantities from a pluralistic world of many languages. 

To that end, I would like to suggest that the first corollary of my 
starting premise, that language proficiency is a central element of all 
educational undertakings, might be that the term "limited-English- 
proficiency" implies a complement of "almost-unlimited-proficiency- 
in-some-other-language-or-languages." While I do not want to deny 
the benefits (or importance) of students acquiring a high degree of 
proficiency in English in these United States, I do want to suggest 
that it is strange that our educational systems and national policies 
(as diverse and amorphous as they may be; see Prestine, 1990, for a 
discussion of great interest) seem generally determined (at least in 
practice) to either ignore or to deliberately remove rather than to 
nurture and preserve the linguistic resources that are literally walk- 
ing into our schools at an ever increasing rate. Corresponding to the 
common emphasis on limitations, disabilities, disorders, disable- 
ments, disenfranchisements, etc., it seems to me that there ought to 
be greater consideration of the positive complements of these terms. 
In this suggestion, I concur with Lynda Miller (1990) where she con- 
trasts her emphasis on "competencies" (taking her cue from the term 
"multiple intelligences" as employed by Gardner, 1983 and seq.) with 
the more common "approach in which the emphasis is on deficits and 
disabilities" (p. 2) or on "impairments, handicaps, and disorders" (p. 
4). 

According to the positive complement of the "deficit approaches" 
- which might be properly called "empowerment approaches" - the 
attainment of language proficiency is perhaps the main road to social 
empowerment (Cummins, 1986). As Miller puts it (following Hirsch, 
1987): "being literate... is possessing shared background knowledge 
and holding positions of responsibility and power at the macro-levels 
of society" (Miller, 1990, p. 3). David Olson (1986) goes so far as to 
suggest that intelligence itself is hardly more than "literate compe- 
tence" (p. 338) or "the distinctive forms of symbolic systems evolved 
and exploited by a culture as a means for representing and acting on 
the world" (p. 345). 1 Even Walters and Gardner (1985), who think in 
terms of "multiple intelligences" (also see Gardner, 1983, 1989, 1990), 
say that in their later development "children demonstrate their abili- 
ties in the various intelligences through their grasp of various sym- 
bol systems" (p. 15). In fact, each separate intelligence, of the seven 
they advocate (which we review in part 3, below), is eventually seen 
"through a symbol system: language is encountered through sen- 
tences and stories, music through songs, spatial understanding 
through drawings, bodily-kinesthetic through gesture or dance, and 
so on" (p. 15). 

These ideas, though not identical with the view that I would like 
to advocate and develop here, still point, as I understand them, in 
the direction we ought to follow, and all of them tend to show the 
central importance of symbolic systems of which, I will endeavor to 
show (following C.S. Peirce [1839-1914]), natural language systems 
are chief. At any rate, all of the foregoing provides, I hope, a suitable 
preamble, a jumping off place, for the development of my main argu- 
ment which follows in four parts which I will preview immediately. 

I begin with (1) a review of the history of primary and non-pri- 
mary language testing and with a provocative question: how come 
there is no field of primary language testing? This quandary will be 
resolved early in the discussion in a way that illustrates my starting 
point above about monolingual myopia. It turns out that there are in 
fact many approaches to the measurement and testing of primary 
language skills, but that nearly all of them have been mis-identified 
as pertaining primarily to some other actually incidental purpose. 
This was unlikely to be noticed, however, owing to the pervasive 
monolingual myopia that has been prevalent for more than a century 
of public schooling and that still pervades the American educational 
scene. Until research on the testing of non-primary language profi- 
ciency began to bud in the late 1950s, hardly anyone ever thought to 
ask about research into the character of primary language profi- 
ciency. For this reason, the ideas to be gleaned from non-primary 
language testing especially, may be of some use to educators at large 
as well as those who work with the growing numbers of LEPs in our 
schools. 

In order to see the connections of research in non-primary lan- 
guage measurement with broader issues in education, the second 
major section of this paper is a review of (2) the broader literature of 
educational measurement as it relates to the central theme — the 
critical role of language proficiency. We will view that theme from a 
variety of angles and try to develop an up-to-date idea of where we 
are at present with respect to the unwieldy problem of measuring 
LEP students and evaluating the programs that purport to serve 
them. 

The third major section of the paper offers (3) a somewhat elabo- 
rated idea of the place of language proficiency in a broader theory of 
human intelligence and representational capacities. Along the way, I 
will try to point out general themes of agreement and certain con- 
trasting trends, e.g., the traditional views of general intelligence as 
contrasted with multiple intelligences as proposed back in the 1930s 
by L.L. Thurstone and others and revived and invigorated in recent 
years by Howard Gardner, Joseph Walters, Vera John-Steiner 
(1985), and others. Building on findings in non-primary language 
testing research, I will propose a possible resolution of the apparent 
controversy over the old notion of a single unifying general intelli- 
gence and distinct multiple intelligences. I will argue that these 
theories are not incompatible, but rather that they are complemen- 
tary ways of viewing different facets of distinctive human abilities. 

Finally, I will conclude with (4) a few observations about how we 
might go about the practical business of testing (and also of teaching) 
the increasing numbers of LEP students that are working their way 
through our schools. I will recommend deep rather than surface as- 
sessment through discourse-based, real life performances. 

(1) Research in Primary and Non-Primary Language Testing 

In undertaking a review of research on language testing, as soon 
as we begin to talk about "non-primary language testing" we are 
bound to ask: Why is there no distinct field of primary language test- 
ing? The answer to this question is that many approaches to the 
business of measuring primary language skills do in fact exist, but 
that they go by many different names. For instance, "intelligence 
testing" generally aims at primary language proficiencies and "verbal 
intelligence testing" specifically does so. Measures of listening and 
speaking abilities, speech and hearing tests, literacy tests of all sorts, 
but especially tests of reading vocabulary, reading comprehension, 
and writing proficiency tests clearly aim at primary language skills. 
In addition to the traditional categories of intelligence and achieve- 
ment tests, there are many deficit oriented categories of primary lan- 
guage assessment: e.g., tests of "language disorders," "learning dis- 
abilities," "mental retardation," and more recently many different 
sorts of "cognitive" and "metacognitive" tests, not to mention "linguis- 
tic" tests, "sociolinguistic elicitation devices," tests aimed at "dis- 
course abilities," "grammatical intuitions," "metalinguistic aware- 
ness," etc. I submit that there are many reasons why these various 
approaches to primary language assessment have not been recog- 
nized as a coherent branch of educational measurement, but none, I 
suppose, is more important than the general affliction of American 
educators with what I am calling here, monolingual myopia. I hasten 
to add that I am not saying that there are no important differences 
among the various fields of study listed in this paragraph, nor am I 
suggesting that primary language proficiency is the only object of in- 
terest. What I am saying is that all of the foregoing measurement ef- 
forts, and many others that I have not named, have as their princi- 
pal, unstated object, the measurement of one or another aspect of pri- 
mary language ability. 

Hakuta (1986) has done an excellent job of illustrating the 
misclassification of many immigrants to the United States ever since 
the early decades of the twentieth century. He traces deficit theories 
of bilingualism back to fallacious interpretations of "IQ" tests that 
were actually little more than measures of English proficiency. More 
recently, Gardner and Hatch (1989) observe that "linguistic and logi- 
cal-mathematical symbolization" predominate in both the curriculum 
and the school tests of "achievement, aptitude, and intelligence" (p. 
6). This same complaint against traditional approaches to the study 
of intelligence in particular is what has led Gardner (1983, 1989, 
1990) and his collaborators (also see Walters and Gardner, 1985, 
1986a, 1986b) to develop the theory of "multiple intelligences." How- 
ever, I submit that it was the prevalence of monolingualism among 
the American educators who held the reins of power from the early 
part of this century that set them up to misinterpret a mere lack of 
proficiency in English as a second language as a widespread intelli- 
gence deficit among children and adults from non-English speaking 
backgrounds. As Hakuta (1986) shows, immigrants in the early de- 
cades of the twentieth century were often described as "linguistically 
confused," "mentally retarded," "learning disabled," and so forth. By 
now it is clear that measures of yet to be acquired language skills 
were simply misidentified as indicating deficient cognitive powers of 
a much deeper sort. 



Moreover, as Ortiz and Yates (1983) have shown, the problem is 
far from solved as we approach the twenty-first century. In Texas 
alone, as recently as eight years ago, Ortiz and Yates found that His- 
panics were grossly over-represented (about 300 percent) in classes 
for the mentally retarded and other exceptionalities. Interestingly, as 
Cummins (1984) points out, the American Association of Mental Defi- 
ciency still depends on IQ scores (formerly one but now two standard 
deviations below the mean) as a part of its definition of "mental re- 
tardation" (McKnight, 1982). But why should anyone expect His- 
panic children to have a 300 percent higher incidence of mental re- 
tardation than other ethnic groups in Texas? What most of those His- 
panic children obviously have in common is Spanish rather than En- 
glish as their first language. A small percentage of them, probably no 
greater than the percentage in other ethnic groups, may have some 
form of genuine mental deficiency, but there is every reason to sup- 
pose that the vast majority of Hispanic children in Texas are quite 
normal in their general mental abilities. 2 Because so many of them, 
however, have been misidentified as exceptional we may suppose 
that some children with genuine difficulties have also been over- 
looked and are not getting the special education they need. 

At least since the time of Francis Galton [1822-1911] (see Galton, 
1869), Darwin's cousin and a precursor of the modern intelligence 
testing movement (which is generally credited to Alfred Binet 
[1857-1911]; see Binet and Simon, 1905), language proficiency tests 
have often been misinterpreted as measures of something else. For 
instance, Binet himself wrote: 

One of the clearest signs of awakening intelligence among young 
children is their understanding of spoken language... (1911, p. 
186). 

He said that according to teachers of his day the best way to form 
an impression of a child's intellect was to "talk to him" (1911, p. 308). 
In fact, the Binet and Simon (1905) tests included such obvious lan- 
guage proficiency tasks as responding to commands (e.g., "Point to 
your nose"), repeating a phrase or sentence, naming objects, telling 
what's going on in a photograph, answering simple questions (e.g., 
"What's your name?" "Are you a boy or a girl?" etc.), counting coins, 
copying a phrase or sentence, reading aloud and recalling points of 
information, writing phrases from dictation, defining words, etc. All 
of this is relatively harmless so long as the language of the testing is 
the child's primary language system, but when it is not, difficulties 
arise. The nearly complete confounding of language proficiency with 
native intelligence persisted in the thinking of Binet, who seemed to 
vacillate between the view that intelligence was distinct from ac- 
quired skills (Binet and Simon, 1905, p. 42) and the view that it was something 
that developed with "instruction" (p. 289). In the year of his death he 
wrote that children of higher standing manifest their "intellectual 
superiority" mainly "in tests where language plays a part" (p. 321). 
The confounding of language proficiency with innate intelligence 
was especially apparent in a variety of fill-in-the-blank test (cloze 
procedure) used by the German psychologist Hermann Ebbinghaus 
[1850-1909]. According to David Harris (1985), as early as 1897, 
Ebbinghaus applied cloze procedure (more than half a century before 
its formal christening by Wilson Taylor, 1953) to meaningful prose 
with the intent of measuring the intelligence of school children. In 
the venerable tradition of Gestalt psychology, Ebbinghaus contended 
that intelligence involved linking elements so as to form coherent 
wholes. As paraphrased by Whipple (1915), Ebbinghaus is reported 
to have said: 

To measure intelligence, therefore, we must employ a test that 
demands ability to combine fragments or isolated sections into a 
meaningful whole. Such a test [that he called 
Kombinationsmethode] may be afforded by mutilated prose, i.e., 
by eliding letters, syllables, words, or even phrases, from a prose 
passage and requiring the examinee to restore the passage, if not 
to its exact original form, at least to a satisfactory equivalent of it 
(p. 285; also quoted in Harris, 1985, p. 367). 

Marion Rex Trabue, about 1914 according to Harris, claimed to 
have improved the procedure by applying it to isolated sentences. 
Trabue argued that using isolated sentences, rather than connected 
prose, allowed him to rank items by difficulty thus creating a near 
interval scale and giving higher reliability in scoring. While Trabue's 
insistence on using disconnected sentences was, in my estimation, a 
step backward from where Ebbinghaus began, Trabue was among 
the first to explicitly say that his tests were measuring "language 
ability" (Trabue, 1916, p. 1). In spite of this, Trabue-type fill-in tasks 
based on isolated sentences continued long afterward to be applied in 
so-called "intelligence" tests which were supposed to be measures, 
not of acquired language skills, but of innate abilities (e.g., tests by 
E. L. Thorndike, Lewis M. Terman, and others). 

Subsequently the various tasks recommended by Binet and oth- 
ers were reinterpreted, and alternately amplified and reduced sev- 
eral times, and were eventually canonized into various modern IQ 
tests (Binet and Simon, 1905; Terman, 1925; Terman and Oden, 
1947; Terman and Merrill, 1960; Kaufman, 1979). The best known 
examples of IQ tests are divisible roughly into the categories verbal 
and non-verbal (or performance) tests. In the non-verbal category 
Raven's Progressive Matrices and Cattell's Culture Fair Test of Intel- 
ligence are often used. Batteries aimed at both categories, however, 
are also well known: e.g., the Thorndike-Lorge, the WISC-R, the 
Otis-Lennon Test of Mental Abilities, etc. 

Arthur Jensen of UC Berkeley fame (cf. Jensen, 1969, 1980) and 
Richard Herrnstein (1973; also Herrnstein and Wagner, 1981) of 
Harvard, extended the IQ testing movement, it would seem, to its 
most extreme limits by claiming to be able not only to reliably deter- 
mine innate intellectual capacities but to distinguish races and eth- 
nic groups according to such measures. Most thinking persons find 
their reasoning spurious and their claims unconscionable — a kind of 
intellectual atavism harking back to racist theories of the philoso- 
pher Nietzsche and the idea of an intellectual aristocracy promoted 
in relation to the eugenics movement that began with Sir Francis 
Galton (1869). While such views have been severely criticized (and, I 
believe, properly so; see Mercer, 1973, 1984; and Gould, 1981), the 
best argument against them has largely been overlooked: namely 
that what the traditional intelligence tests measure best are acquired 
primary language skills. This idea is latent in the recent literature 
on "multiple intelligences," but has rarely been brought to bear as 
some believe it should (Oller, 1991). For instance, Walters and 
Gardner (1985) say, "We speculate that the usual correlations among 
subtests of IQ tests come about because all of these tasks in fact mea- 
sure the ability to respond rapidly to items of a logical-mathematical 
or linguistic sort" (pp. 13-14). This very nearly amounts to saying 
that what those tests mainly measure is primary language profi- 
ciency (Oller and Perkins, 1978). 

In spite of the long history of primary language testing from the 
early 1900s forward under the guise of IQ measurement, the notion 
of language proficiency per se, would progress little until empirical 
studies of foreign language proficiency began to appear in the late 
1950s. Among the first was Carroll, Carton, and Wilds (1959) show- 
ing that cloze procedure had some potential as a measure of language 
proficiency. A spate of studies would soon follow (Carroll, 1961; Lado, 
1961; Valette, 1964, 1967) but it would not be until the latter part of 
the 1960s that non-primary language testing research would begin to 
flourish (cf. Upshur, 1967; Upshur and Fata, 1968; Spolsky, 1968a, 
1968b; Anderson, 1969; Upshur, 1969a, 1969b; Oller, 1970; Oller and 
Conrad, 1971; Savignon, 1971). From there forward, too many re- 
search reports, conferences, and books would be generated for them 
to be adequately covered in any single review. However, it would not 
be until June, 1984 that the first issue of the journal Language Test- 
ing would appear. By then certain general themes and trends had 
been fairly well defined and many of the paths that are currently 
being followed out had been marked off. Rather than try to plod 
through the whole terrain, in what follows I will concentrate on what 
I think the most important themes were in the 1970s and 1980s and 
still are in the 1990s. 

It was John Carroll (1961) who suggested the distinction between 
discrete point approaches and integrative approaches to language 
testing. Discrete-point tests were grounded in the taxonomic ap- 
proaches to linguistics that would later fall into disfavor as the 
Chomskyan revolution (see Chomsky, 1956, 1957, 1965, 1972, 1975, 
1980a, 1980b, 1988) began to have its fuller impact into the 1970s 
and 1980s (see Newmeyer, 1980). Discrete point tests were based on 
inventories (taxonomies) of various sorts of elements. For instance, 
the phonological system of a language was supposed to consist of 
phonemes which could be tested one by one. The lexicon was a list of 
words, and grammar (alias syntax) was a list of patterns. This taxo- 
nomic way of looking at language, and at human abilities in general, 
still prevails among many (though certainly not all) psychologists (cf. 
the numerous examples cited by Cummins, 1984), speech-language 
pathologists (Coles, 1978, and Cummins, 1986, document this claim), 
and educators in general (Cummins, 1984, 1986; Cummins and 
Swain, 1986; Bloom, 1976; Bloom and Krathwohl, 1977; Swanson, 
1988). 

According to the discrete-point model, a sufficient number of 
items aimed at elements drawn from the several inventories of pho- 
nemes, morphemes, lexical items, and syntactic patterns would as- 
sure a valid test of language proficiency. In the 1980s, this same 
taxonomical thinking would persist in lists of "notions" and "func- 
tions" of speech acts and discourse (cf. Farhady, 1983b, and his refer- 
ences). The latter extension was certainly a natural one, but it did 
not really depart from discrete-point theory. The purest varieties of 
such thinking, e.g., Lado (1961), contended that language test items 
should focus on only one skill (e.g., listening), and only one domain 
(e.g., phonology), and only one element (e.g., a particular phonemic 
contrast) at a time. Besides distinguishing domains of structure - 
phonology, morphology, lexicon, and syntax (semantics and pragmat- 
ics were not much thought of during the discrete-point heyday) - dis- 
crete-point testers also distinguished skills (listening, speaking, read- 
ing, and writing). It was claimed that a test item could not be very 
good if it mixed several skills and/or domains of structure. And this 
contention itself pointed to what Carroll (1961) called "integrative 
tests." 



For instance, Robert Lado (1961) contended that giving dictation, 
a foreign language testing technique popular with language teachers 
(cf. Valette, 1964; Finocchiaro, 1964), was not a good method because 
it mixed everything together. It was integrative rather than discrete- 
point (i.e., taxonomical) in its orientation. According to Lado, dicta- 
tion did not test phonemic contrasts since these were apt to be given 
away by lexical or syntactic context. It did not test words because the 
words were "given" by the person reciting the material to be written 
down. It did not test syntax since the syntax also was "given." Worse 
yet, according to discrete-point thinking, dictation mingled listening 
comprehension with writing and reading. It also mixed phonology, 
vocabulary, morphology, and syntax (not to mention semantics and 
pragmatics) into a potpourri. 

Discrete-point theory, however, in the final analysis was more of 
a hypothetical perspective than a practical one. Had it been influ- 
enced much by empirical evidence, it would have had to be radically 
revised since language students in taking dictation do make many 
errors in just the domains that Lado claimed were not tested. For in- 
stance, in actual dictation protocols, we find evidence of phonemic 
contrasts that have been obliterated, for example, "collect" is apt to 
be rendered "correct" by an Asian writing a dictation in English. Or, 
complex consonant clusters of certain types of morphological inflec- 
tions are apt to be omitted in many cases. Furthermore, the same 
persons who make these sorts of errors in taking dictation are apt to 
make analogous errors in writing an essay, speaking, or other dis- 
course processing tasks. In fact, such problems carry over into rela- 
tively routine tasks such as repeating sequences of heard material, 
reading aloud, or even copying a text. 

Also, in taking dictation, word order is sometimes adjusted in 
surprisingly creative and ungrammatical ways. Lexical items are 
changed radically. For example, in one study at UCLA a passage on 
"brain cells" was rendered in an almost coherent way by one non-na- 
tive speaker of English as a text on "brand sales." Almost everything 
in the text was changed though a superficial phonetic resemblance 
remained between what had been dictated and what was written 
down. Less dramatic transformations of the same sort are commonly 
observed in dictation protocols (cf. Oller, 1979, pp. 283-285, for sev- 
eral examples). 

As I argued in 1979 (p. 266) and continue to believe today, dis- 
crete-item tests do not accord well with what people do when they 
process text or discourse in normal ways. An example of a test exem- 
plifying early discrete-point, taxonomical theory that has been widely 
applied but without much success is the Carroll and Sapon (1959a) 
Modern Language Aptitude Test (also see their Manual, 1959b). 
Carroll (1967) found, in a massive study of college foreign language 
majors near graduation, that the MLAT was only a significant pre- 
dictor of foreign language attainment if extraneous variables such as 
interest, parental language background, and travel to the foreign 
country were included in the regression equations. Even with these 
extraneous variables added in, the MLAT still accounted for a mod- 
est 9 percent or less of the total variance in foreign language attain- 
ment. The several subtests of the MLAT itself, however, accounted 
for less than 1 percent of the total variance in foreign language at- 
tainment. More recently, Goodman, Freed and McManus (1990) 
again found the MLAT to be a non-significant predictor of success in 
foreign language courses for 586 students tested at the University of 
Pennsylvania. They speculated that perhaps the failure of the MLAT 
in this case was due to the fact that language teaching seems to be 
moving more and more in the direction of integrative, whole lan- 
guage approaches. 
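
The phrase "percent of the total variance" can be made concrete with a small computation. For a single predictor, the proportion of variance accounted for is the square of the correlation between predictor and criterion, so that 9 percent corresponds to a correlation of about .30. The following is only a sketch under that single-predictor assumption (the regressions reported above involved several predictors), the scores are invented for illustration, and the function names are hypothetical. 

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def variance_explained(x, y):
    """Proportion of variance in y accounted for by x (single predictor): r squared."""
    return pearson_r(x, y) ** 2


# Invented scores for six hypothetical students (not data from any study).
aptitude = [52, 61, 47, 70, 58, 66]
attainment = [3.1, 2.4, 2.9, 3.3, 2.2, 3.0]

share = variance_explained(aptitude, attainment)
print(f"Proportion of variance accounted for: {share:.2f}")

# A correlation of .30 accounts for about 9 percent of the variance.
print(f"r = .30 gives {round(0.30 ** 2, 2):.2f}")
```

Seen this way, a predictor that accounts for less than 1 percent of the variance correlates with the criterion at less than .10. 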

It is possible to find many examples of integrative tests that
actually proved more robust both in theory and in practice than
discrete-item tests. These included dictation (Valette, 1964), essays
(Briere, 1966), answering questions orally (Upshur, 1967, 1969a),
telling a
story (Politzer, Hoover, and Brown, 1974), giving a speech, conversa- 
tion or oral interview (ETS, 1970), reading aloud (Kolers, 1968), an- 
swering questions about a text (Politzer, Hoover, and Brown, 1974), 
repeating sequences from a text or narrative (also known as "elicited 
imitation"; Baratz, 1969; Politzer, Hoover, and Brown, 1974; Swain, 
Dumas, and Naiman, 1974), translating from L1 to L2 or the reverse
("elicited translation"; Swain, Dumas, and Naiman, 1974), etc. One of 
the various integrative types of task experimented with in the late 
1960s and early 1970s was cloze procedure — a method christened as 
such by Wilson Taylor (1953, 1956, 1957) for measuring readability 
of texts. It involves omitting words from a written (or possibly oral)
text and requiring the examinee to replace the missing items
(Anderson, 1969; Spolsky, 1968; Oller and Conrad, 1971; Oller,
1973).
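A small sketch may make the procedure concrete. The Python below assumes a fixed-ratio deletion (every nth word) and exact-word scoring. Both are illustrative assumptions rather than details given in the studies cited, and the names make_cloze and score_exact are hypothetical.

```python
import re


def make_cloze(text, nth=7, start=2):
    """Blank out every nth word, beginning at word index `start`.

    Returns the gapped text and the list of deleted words (the key).
    """
    gapped, answers = [], []
    for index, word in enumerate(text.split()):
        if index >= start and (index - start) % nth == 0:
            answers.append(word)
            gapped.append("_____")
        else:
            gapped.append(word)
    return " ".join(gapped), answers


def score_exact(answers, responses):
    """Count responses that restore the deleted word exactly.

    Case and punctuation are ignored; this is the strict "exact word"
    criterion, assumed here only for illustration.
    """
    def normalize(word):
        return re.sub(r"[^\w]", "", word).lower()

    return sum(
        1 for answer, response in zip(answers, responses)
        if normalize(answer) == normalize(response)
    )


# Toy passage, invented for the example.
passage = "one two three four five six seven eight nine ten"
gapped, key = make_cloze(passage, nth=3, start=1)
print(gapped)
print(key, score_exact(key, ["Two", "five,", "nine"]))
```

A scorer that also accepts contextually appropriate words would need human judgment or a list of acceptable alternatives, which this sketch does not attempt.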

As empirical research began to accumulate in the 1970s and into 
the 1980s it became clear that there were practical as well as theo- 
retical differences between integrative and discrete-point tests. Inte- 
grative tests were apparently measuring some traits and abilities of 
language users that discrete-point tests could not get at. Still, even 
into the 1970s there were some, Earl Rand of UCLA, for instance, 
who insisted that discrete-point methods were either better or at 
worst equivalent to integrative tests (Rand, 1972, 1976). These 
claims were rarely sustained in practice. If one had examined closely 
the empirical results, it would have become clear that greater reli- 
ability and greater validity generally accrued to tests falling toward 
the integrative end of the spectrum. 

Farhady (1983a) disagreed with this claim, but his examples
were, as Oller (1983b, p. 321 footnote a) pointed out, drawn from
tests that were quite integrative in character. Therefore, when
Farhady (1983a) claimed that there was no difference between
integrative and discrete-point tests with respect either to reliability or
validity, he was really saying in effect that there is little difference 
between several about equally integrative tests. He was comparing 
reasonably good oranges with other reasonably good oranges. There 
were no truly discrete item tests in the inventory he compared. In 
any event, it is illogical to argue that the kind of test item that fully 
isolates a particular phonemic contrast, or a single lexical item, or a 
particular grammatical morpheme, or a syntactic rule, will yield re- 
sults equivalent to the sort of test that requires the employment of a 
vast system of such relationships - a whole grammar. If those two 
types of tests did turn out to be equivalent (which they are not, see 
also Damico and Oller, 1980; and Damico, Oller, and Storey, 1983),
the result would be entirely anomalous as there simply is no theory
whatever that predicts such an outcome. If a given phonemic
contrast, say, /i/ versus /I/, is not in some sense distinct from, say, the
syntactic transformation that copies the number of a referring head
noun onto its respective present tense verb and its demonstrative
modifier, e.g., in "These recommendations are...", then the distinction 
between phonology and syntax must be misguided. But how? While 
tests of particular phonemic contrasts, or inflectional morphemes, or 
syntactic rules, might generate reliabilities in the range of .6 to .7 
(e.g., Evola, Mamer, and Lentz, 1980), tests of a more integrative 
character generally yield reliabilities about 10 points higher in the 
range of .8 to .9 (Oller, 1972, for instance). Or consider the fourteen
different integrative tasks used in research to calibrate the language
question on the 1980 U.S. Census: none yielded a reliability lower
than .98 (cf. Scott, 1979).

It seemed to many, therefore, toward the end of the 1970s that 
integrative testing had prevailed over discrete-point approaches. 
However, this conclusion may have been premature. In the context of 
normal language processing, any given discrete-point item of interest 
may always be singled out for special attention in that context. On 
the other hand, a single element of any sort (a thoroughly isolated 
discrete-point) in the absence of the dynamic tensional context of
discourse is like the sound of one hand clapping. Such discrete-points
become mere fictions, like the dimensionless points of a line. Without 
the line, the points along it are dimensionless locations occupying 
space exactly nowhere. In context, notions of discrete elements of
language structure or skill are valuable theoretical constructs, but
without the context, they are undefined fictions.

Out of the controversy over discrete-point versus integrative 
tests, there emerged a distinction of a different sort. While the origi- 
nal dichotomy (proposed by Carroll, 1961) was based on superficial 
aspects of test items, domains of structure, and modalities of process- 
ing, it became increasingly clear that the distinction had been incom- 
pletely and inadequately drawn. Carroll (1961), Rand (1976), and 
Farhady (1983a) all observed that there never was a truly categorical 
difference between discrete-point and integrative test items. The
difference was merely one of degree. The supposed dichotomy was in
fact a continuum whose end-points were fully distinct only in theory. In
practice, there are no completely discrete-point tests any more than there
are points or lines in the space/time continuum apart from some ob- 
ject or trajectory to define them. In actual experience all test items 
are more or less integrative in character. 

Normal language use always involves meaning beyond the
theoretically discrete elements of surface forms. That is, there is a linking
with persons, places, things, events, relations, etc., in experience. 
However, if this meaning aspect beyond surface form is admitted, no 
test item can meet the demands of discrete-point theory. As I have 
hinted several times above, it may be worth saying straight out at 
this point that semantics and pragmatics were notably absent from 
discussions of discrete-point items. This was probably due to the fact
that meaning as such is never a discrete-point affair. It cannot be,
since meaning spills over into the whole continuum of experience 
which the very existence of meaning both presupposes and implies. 

Another insurmountable difficulty for discrete-point theory was 
that language use occurs in real time and is therefore time-con- 
strained. This is not so obviously true for reading and writing as it is 
for listening and speaking tasks. However, it is easy to prove with a 
little thinking that in fact there are severe temporal constraints on 
reading and writing as well as on oral tasks. Meanings that involve 
long-range constraints in a written text, for instance, are essentially 
inaccessible to persons who lack a certain level of language profi- 
ciency owing to the limited time that they can hold the target lan- 
guage material in working memory. If the requisite part of the 
memory image fades from consciousness before the part with which 
it must be linked can be grasped, it will be impossible because of this 
temporal fact to grasp the full meaning. 

Moreover, there are many other ways that real time constraints 
operate with reference to reading and writing in respects that are 
precisely analogous to temporal constraints on oral tasks. For in- 
stance, we may not have time to go and ask someone what So-and- 
So's last name is so we can look him up in the phone book. Or, we 
may not have time to drive to the library to look up a particular ref- 
erence for a research paper. We may spend hours looking for a cer- 
tain statement in a large book, or several volumes. These cases are 
hardly different from the problem of trying to recall some significant 
detail from a conversation (e.g., did he say to turn right or left on 
Oak Street?). In the final analysis, the salient differences between 
speech and writing seem less so when we look more closely at each 
one. Time and meaning, respectively, constituted the pragmatic 
naturalness constraints that led to a differentiation, therefore, of a 
certain subclass of integrative tests that came to be known as
pragmatic (Oller, 1973, 1979; Cohen, 1980; Savignon, 1983). This
subclass, it turned out, was entirely distinct from discrete-point tests. In
fact, the pragmatic naturalness criteria eliminate any strictly
discrete-point item as unnatural. Such items do not really involve
normal language use any more than the recitation of a number or
parroting a numerical operation constitutes mathematical reasoning.

In addition, many tests that are thoroughly integrative in char- 
acter also fail to meet the pragmatic naturalness criteria. For in- 
stance, the proofreading test explored by Barrett (1976) was integra- 
tive but failed the meaning criterion. It involved the omission of mor- 
phologically redundant elements (e.g., plural markers, tense indica- 
tors, articles, prepositions, verb particles, etc.) from prose and re- 
quired the restoration of these elements by examinees. A peculiarity 
of the task was that fluent readers had to attend so much to the
surface form of the text in order to notice the missing elements that they
failed to process the meaning of the text and after performing the
task could not even tell what the text was about. On the other hand, 
examinees who did concentrate on the meaning, and who could an- 
swer reasonable questions about its content, would invariably get low 
scores. These results are consistent with the frequent observation by 
proofreaders that plying their trade slows down their reading. In 
fact, they often resort to rather unusual methods of checking surface 
forms such as reading the text backwards, or following it word-for- 
word while someone else reads aloud, and the like. These extreme 
measures are useful because proofreading requires a somewhat un- 
natural attention to surface form and good readers are often the 
worst proofreaders because they supply much information that is not 
in fact in the surface forms at all (cf. Goodman, 1967; Goodman and 
Goodman, 1977; Goodman, Goodman, and Flores, 1979; Smith, 1975, 
1978, 1982, 1984, 1989). 

Another procedure that is integrative but fails the time require- 
ment is the sort of multiple-choice cloze test where a list of many 
(say, 50 or more) words is given and the words must be reinserted, one by one,
into a text with blanks. This task is highly integrative but may in- 
volve looking back and forth between the list and the text, and a con- 
stant rereading of the list. It may be more like solving a cross-word 
puzzle than normal discourse processing. Because of the frequent in- 
terruptions, in looking back and forth between text and list, and the 
time lapses while reading the list, it is doubtful that such a task con- 
stitutes a pragmatically viable procedure. At any rate, as the list of 
possible words becomes longer and longer, it is clear that the task 
resembles less and less the normal processing of discourse. 

What was more important about pragmatic tests, and what is yet
to be appreciated fully by theoreticians and practitioners, is that all of
the goals of discrete-point items, e.g., diagnosis, focus, isolation, etc.,
could be better achieved in the full rich context of one or more prag- 
matic tests. As a result, it was argued that the valid objectives of dis- 
crete-point theory could be completely incorporated within a prag- 
matic framework. However, the goal of separating each and every 
element of structure or skill from the whole fabric of experience was 
abandoned. As an analytic method of linguistic analysis, the discrete- 
point approach may have had some validity, but as a practical 
method for assessing language abilities, it was misguided, counter- 
productive, and logically impossible to achieve. 

Another outcome of the discrete-point/integrative controversy, 
and the empirical research which it spawned, was a reconsideration 
of the almost forgotten g-factor of Charles Spearman (1904, 1927).
This development had two sides: one statistical and the other
theoretical. The statistical side of the argument was soon resolved
against any all-inclusive g-factor, but the theoretical argument has
yet to be adequately considered.

Charles Spearman had observed that most intelligence tests, in
his day (and it may be noted that things have changed little since 
then; cf. Jensen, 1969, 1980) were strongly correlated. By inventing 
factor analysis, then a new statistical technique, Spearman showed 
that it was possible to identify a single general factor underlying 
most IQ tests and accounting for a huge chunk of variance in all of 
them. The same argument could still be extended to almost all
achievement, competency, and proficiency tests used in education
today (see Oller and Perkins, 1978, Gunnarsson, 1978, and Stump,
1978, and for counterpoint and response, Carroll, 1983b, and Oller,
1983a; but see Gardner and Hatch, 1989, who claim to be able to
measure separate "intelligences" independently). This general factor
came to be known as "g" or "the g-factor." Subsequently, L.L.
Thurstone (1924, 1938, 1947; also Thurstone and Thurstone, 1941) 
and others, argued in favor of a plurality of primary mental abilities 
instead of a single g-factor of intelligence. They never settled how 
many primary factors there were or just how to define them. They 
vacillated in the end between six and eight distinct primary factors. 
In more recent years Guilford's "structure of intellect" model has 
multiplied these factors to 120 (Guilford, 1967). More recently still, 
Gardner (1983, 1989, 1990), Gardner and Hatch (1989), and Walters 
and Gardner (1985, 1986a, 1986b) have picked up the cudgel again 
on behalf of multiple intelligences. While Gardner and colleagues dif- 
fer in their particular list of "intelligences" from the "primary factors" 
proposed much earlier by the Thurstones, there is a fundamental
resemblance in both the arguments and applications of the ideas fa- 
voring profiles that look at the broad spectrum of a person's abilities 
rather than a single IQ score. 
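The statistical idea of a general factor can be illustrated with a rough sketch. The Python below does not reproduce Spearman's own procedure; it substitutes the first principal component of a correlation matrix, found by power iteration, as a simple stand-in. The correlation values are invented for illustration, and the function name first_component is hypothetical.

```python
def first_component(corr, iterations=200):
    """Return (loadings, share) for the first principal component.

    corr is a square correlation matrix given as a list of lists.
    share is the proportion of total variance the component carries.
    """
    n = len(corr)
    vec = [1.0] * n  # starting guess for the leading eigenvector
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in nxt) ** 0.5
        vec = [x / norm for x in nxt]
    # Rayleigh quotient: the eigenvalue that goes with the unit vector vec
    product = [sum(corr[i][j] * vec[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v * p for v, p in zip(vec, product))
    loadings = [v * eigenvalue ** 0.5 for v in vec]
    return loadings, eigenvalue / n


# Hypothetical correlations among four tests (every pair correlates .6).
tests = [
    [1.0, 0.6, 0.6, 0.6],
    [0.6, 1.0, 0.6, 0.6],
    [0.6, 0.6, 1.0, 0.6],
    [0.6, 0.6, 0.6, 1.0],
]
loadings, share = first_component(tests)
print([round(x, 3) for x in loadings], round(share, 2))
```

With these invented values a single component carries 70 percent of the total variance (an eigenvalue of 2.8 out of 4), while the remaining 30 percent is left for more specific sources.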

However, long before Howard Gardner and colleagues came to 
the fray, it was generally admitted (by L.L. Thurstone himself, and 
more recently by his student J.B. Carroll and others) that underlying 
any set of primary factors or secondary or tertiary ones there will 
still be a general factor. A recent study of language proficiency by 
Fouly, Bachman, and Cziko (1990) concludes that a second order 
general factor and a model that allows differentiated components at 
the first order level are both fairly good at predicting observed rela- 
tions between different language measures for 334 ESL students at 
the University of Illinois. They refer to Carroll (1983a) who summed 
up both his results and those of Fouly, et al. (1990) in terms of the 
long term controversy over general versus specific factors in lan- 
guage testing research: 

With respect to whether the results support a "unitary language 
ability hypothesis" or a "divisible competence hypothesis," I have 
always assumed that the answer is somewhere in between. That 
is, I have assumed there is a "general language ability" but, at 
the same time, that language skills have some tendency to be
developed and specialized to different degrees, or at different rates
so that different language skills can be separately recognized and
measured (p. 82). 

Fouly, et al. go on to say, "the present study provided support for 
the differentiated skills hypothesis recurrent in the works of 
Bachman and Palmer (1983), Carroll (1983a), Farhady (1983c), and 
Upshur and Homburg (1983).. ..Similarly, the findings of this study 
support the claim that, in addition to differentiated language skills, 
there exists a general factor" (p. 16). In support of the latter model 
they might have cited Oller and Perkins (1978, 1980) and Oller
(1983a). A general factor of language proficiency (or what has been 
called "intelligence," in the case of tests of primary language abili- 
ties), cannot be denied on statistical grounds (Carroll, 1983a, 1983b). 

While at first multiple factors as contrasted with a general factor 
were thought of as mutually exclusive, this was never correct. The 
general factor, whimsically referred to as the Godzilla factor by 
Purcell (1983) could be useful in spite of the fact that it did not ex- 
haust all of the reliable variance in a number of language tests and 
even though it could be transformed in a variety of ways into a
multitude of component factors (see Farhady, 1983c; Upshur and
Homburg, 1983; Bachman and Palmer, 1983; Vollmer and Sang, 1983).
Godzilla, therefore, was prematurely proclaimed to be dead (by 
Purcell, Farhady, and others), and certain persons set out to bury 
him (Alderson and Hughes, 1981; Palmer, Groot, and Trosper, 1981; 
Porter, 1983; Spolsky, 1983; Alderson, 1983; Hughes and Porter, 
1983; Davies, 1984). But Godzilla refused to be buried. It was true 
that he was not quite tall and strong enough to embrace the whole 
world (i.e., explain all of the variance in all tests), but he was plenty 
large and strong enough to resist burial (Bachman and Palmer, 1983; 
Carroll, 1983a; Bachman, 1990; Fouly, Bachman, and Cziko, 1990;
Oltman, Stricker, and Barrows, 1990).

Although some researchers continue to pursue the elusive goal of 
resolving the general factor into its "proper" components (Sang, 
Schmitz, Vollmer, Baumert, and Roeder, 1986; Bachman and Clark, 
1987; Bachman, 1990; Fouly, Bachman, and Cziko, 1990), it would seem
that a definitive division of language proficiency into its contributing 
components may be unachievable in principle by virtue of the fact 
that the multi-faceted semiotic hierarchy can be viewed from many 
complementary angles that logically should prove to be about equally 
correct (witness the findings of Fouly, et al. 1990). At any rate, the 
most important side of the argument is not statistical, but theoretical 
- the fundamental problem is to find a coherent theory and it is cer- 
tain that this cannot be achieved by purely statistical methods (see 
Bachman, 1990, pp. 296-358; Cummins, 1981; Krashen, 1981, 1982, 
1985; Carroll, 1983a, 1983b; Upshur and Homburg, 1983). Upshur 
(1979), Carroll, and others have shown that the componential
resolution of a general factor into a plurality of contributing components is
not at all incompatible with the notion that language proficiency may
be a fairly coherent and integrated totality. If we consider the mean- 
ing of total scores on tests with diverse subtests, or if we consider the 
fact that communicative abilities interact in complex ways to produce 
composite results, it is clear that both general and specific factors 
must be present in language proficiency. We will examine a few
possibilities in section 3 below.

Aside from exploratory and confirmatory factoring of the traits 
(or theoretical constructs) that we may posit as aspects of human 
mental abilities or language skills (which I do not take to be the 
same thing, contrary to Boyle, 1987) and methods associated with 
particular tests, a number of interesting research reports using item 
response theory (IRT; following Rasch, 1980; see Davidson, 1988;
Lynch, Davidson, and Henning, 1988; and Kunnan, 1990) or
multidimensional scaling (Oltman, Stricker, and Barrows, 1990; and
Oltman and Stricker, 1990, following Guttman, 1965) have appeared.
The common purpose of much of the research has been to sort out 
distinct sources of variance in language test scores. Among the 
widely recognized possibilities are three major sources as shown in 
Figure 1 below: (1) producers of discourse or text themselves differ in 
language abilities (and other mental abilities as well), as do (2) con- 
sumers, and as do (3) the texts or discourses (items in the case of 
many tests) that are both produced and understood. These three 
sources of variance can, of course, be further parsed up in a great va- 
riety of ways. One of the interesting and instructive avenues of re- 
search has been item response theory (IRT). Citing a single study 
will show how IRT can be applied to turn up unexpected sources of 
test item biases. 

Kunnan (1990) demonstrated with an IRT approach (using a one 
parameter Rasch model with approximately 844 subjects) that sub- 
jects of different native language backgrounds and gender differ in 
performance on certain language test items depending in part on the 
instruction they have received probably in their major fields of study. 
At any rate, differential item functioning (DIF) was observed on the 
150-item ESL Placement Examination at UCLA used in the Fall of 
1987 on about 15 percent of the items. Apparently, Davidson (1988; 
see footnote 1 on p. 742 of Kunnan, 1990) had already shown that the 
test items in question met the requirement of unidimensionality in 
order for one parameter IRT to be applied. Based on that assump- 
tion, Kunnan found that certain grammar items focussing on the 
definite article, one or more prepositions, and verb tense were easier 
for Chinese and Japanese subjects (than for Spanish or Korean
subjects), though different items (three in each case) performed
differentially for the two groups. Also, four vocabulary items proved
significantly easier for Spanish speakers: hypothetical, implication,
elaborate, and alcoholics.

Figure 1

The Three Main Sources of Variance
in Language Test Scores

Since these words have Latin bases and cognates in Spanish with
similar meanings, Kunnan credited native language background it- 
self with the observed DIF for these items. Additional differences 
were observed for gender on 20 items some of which seemed to differ 
according to the major field of candidates. Items oriented toward the 
sciences seemed to favor males. Three items that favored females 
could not be accounted for. The results are interesting insofar as they 
show that items may be unintentionally biased against or in favor of 
certain groups. However, remedies for preventing this sort of bias 
are not clear: Kunnan, for instance, recommends that "a broad range 
of test content and formats" may help to reduce instructional bias. As 
for gender and native language biases, these are more difficult to 
deal with. They can be spotted on a post hoc basis with IRT, and the 
items can then be rewritten, but it is not entirely obvious how the 
author's recommendation that demographic data be elicited in
advance might be used in test preparation. Certainly for items that
remain unexplained even after the post hoc IRT analysis, neither a
demographic questionnaire nor any sort of pre-screening, even by
members of the targeted examinee population, would seem likely to
prevent DIF whose source is, for the moment, unknown. The research
is, in my view, nonetheless important in demonstrating the subtle
kinds of test biases that can arise and the widely different sources of
variance that may constitute such biases.
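For readers unfamiliar with the one-parameter model, a minimal sketch follows. In the Rasch model the probability of a correct response depends only on the difference between a person's ability and an item's difficulty, both expressed on a logit scale. The ability and difficulty values below are invented and are not Kunnan's estimates; the function names are hypothetical.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the one-parameter model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def difficulty_contrast(difficulty_a, difficulty_b):
    """Difference in an item's estimated difficulty between two groups.

    A contrast far from zero means examinees of equal ability face
    unequal odds on the item, which is what DIF refers to.
    """
    return difficulty_a - difficulty_b


# Hypothetical item calibrated separately in two groups (logits).
ability = 0.0
p_a = rasch_probability(ability, -0.5)  # item easier for group A
p_b = rasch_probability(ability, 0.5)   # item harder for group B
print(round(p_a, 3), round(p_b, 3), difficulty_contrast(-0.5, 0.5))
```

In this invented case two examinees of identical ability succeed about 62 percent and 38 percent of the time, respectively, solely because of group membership. An operational DIF analysis would also need standard errors and a significance test, which the sketch omits.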

Similar, though somewhat more specific, biases for Japanese
learners of English as a foreign language are demonstrated
experimentally by Chihara, Sakurai, and Oller (1989). Our work used a
more traditional repeated-measures approach but predicted in
advance what sorts of items in a cloze passage were biased against
Japanese learners of EFL. Because Japanese subjects were compared 
against themselves in a repeated measures design, the variance of 
interest in particular items can be attributed specifically to the cul- 
tural or experiential background of the subjects tested. Two cloze 
passages were each presented in two forms: each passage appeared 
in an unmodified (biased) form and in a modified (reduced bias) form.
The method of modification was to change unfamiliar place names in 
the U.S. and Greece to familiar ones in Japan, and one instance of a 
mother kissing her son was changed to hugging (which is acceptable 
in Japanese culture). The results showed a significant advantage 
overall favoring the modified texts in spite of the fact that all else 
was left unchanged. The results, though based on an entirely differ- 
ent experimental procedure, agree with those of Kunnan (1990) us- 
ing IRT, in showing that items may function differentially according 
to the background of subjects. 

A rather different application of IRT comes from Lynch, 
Davidson, and Henning (1988). While Kunnan (1990) was interested 
in variance across items, Lynch, et al., focussed on variance within 
persons (on a different form of the same UCLA ESLPE examined by 
Kunnan). Lynch, et al., wanted to determine if variance within per- 
sons could also be regarded as unidimensional. It had been deter- 
mined in several prior studies that variance across items tended to 
be unidimensional. Both person variance and item variance need to 
be unidimensional in order for one-parameter Rasch models to be
optimally applicable. Like Oltman, Stricker, and Barrows (1990) - who
used a different approach, multidimensional scaling (following 
Guttman, 1965) - the evidence obtained by Lynch, Davidson, and 
Henning (1988) seemed to show that unidimensionality may not be 
achieved until language learners gain some maturity in the target 
language. Their conclusion expresses this idea negatively: with refer- 
ence to violations of unidimensionality, they say that their results 
seem to support the notion that such violations are more serious at 
the lower end of the ability continuum (p. 218). 

Citing Oltman and Stricker, Lynch, et al. note that the few
dimensions detected tend to merge into a larger primary dimension at
the upper end of the ability scale (p. 207). 

This same observation has been made by Oltman, Stricker, and
Barrows (1990) on the basis of a different statistical technique (mul- 
tidimensional scaling). 

Whereas Lynch, et al., studied responses of 678 subjects taking
the UCLA ESLPE in the Fall of 1987, Oltman and colleagues studied 
53,169 subjects who took the Test of English as a Foreign Language 
in May of 1985. These results give fairly persuasive evidence that 
whatever factors or dimensions language proficiency may resolve
into probably do vary dynamically over time just as Clifford (1980)
and Lowe (1980) predicted they would. In fact, Figure 2 suggests an 
abstract idea of the sort of thing that appears to be happening with 
the TOEFL and with the UCLA ESLPE as well. Whereas in the early 
stages of second language learning, distinct dimensions of listening, 
writing, and reading ability may be observed (and these may even 
resolve into further sub-component traits or categories), as learners 
progress to a more mature, native-like capacity in the target lan- 
guage, it seems that the diverse dimensions (factors, traits, or what- 
ever they may be called) tend to converge to a more unidimensional 
structure. 

Figure 2 

Hypothetical Convergence of Arbitrarily 
Designated Factors or Dimensions Designated 
a, b, c, ...z (traits, methods, or whatever) 
of Language Proficiency Viewed Over Time until 
Maturity is Attained. 




A tentative hypothesis may be offered: Perhaps the various di- 
mensions (whether attributed to persons or to items) that are sorted 
out by language tests (and observed in some detail through multidi- 
mensional scaling techniques) tend to converge on some more or less 
well-determined norm that is defined by the community of users who 
know and use the target language in question for the sorts of pur- 
poses that the language tests inadvertently characterize. There are 
good theoretical reasons to suppose that some sort of normative con- 
vergence must in fact occur in "normal" language acquisition. 
Whereas learners may vary considerably in the rate and degree of 
initial success in mastering all of the diverse aspects of a language 
system, the sounds and meanings of words, the syntax and semantic
values of phrases and clauses, not to mention pragmatic applications
in experience, must all tend toward more or less standardized norms 
in order for communication to be possible across the diverse members 
of any given language community. It is precisely in this sense, I be- 
lieve, that language tests must always to some degree be normative 
in principle. Criterion-referencing is not ruled out, but it will neces- 
sarily be incomplete unless supplemented by norm-referencing (i.e., 
specifically to the norms of the language community in question). 
Languages, whatever else they may be, are intrinsically norms of
symbolic behavior. We will return to this idea in section 3 below, but 
first it may be useful to examine some of the broader research on the 
measurement of human abilities in order to appreciate better the 
special role played by language abilities. 

(2) Review of Educational Measurement 

Modern variants of the analytic approach typified by the
discrete-point foreign language testing of the 1960s can still be found in
abundance in the general literature of educational measurement. 
Kagan (1990) complains about the "atomistic view of effective teach- 
ing that emerged from the process-product research of the 1970s" as 
well as the mistaken notion that a teacher's competency can be de- 
fined entirely in terms of a "laundry list of behavioral objectives" 
(Howey and Zimpher, 1989; Kagan, 1990, p. 419). Of course, a review 
of the literature shows that the laundry-lists have not been limited to 
behavioral objectives for teachers but have been extended to every 
domain of the curriculum and every sort of testing - including tests 
aimed at intelligences, achievement, bilingualism, language disor- 
ders, etc. 

Nowhere is the atomistic, discrete-point approach more apparent 
than in the literature about how to construct "items." In fact, the 
analytic, taxonomical philosophy (reflecting little influence as yet 
from the Chomskyan revolution; e.g., see the numerous references to 
the taxonomy of Benjamin S. Bloom still prevalent in the literature) 
continues to hold sway in most educational and psychological testing. 
For example, Roid and Haladyna (1982) describe "the heart of what 
is currently known as CR [criterion-referenced] testing" as the notion
that "a domain-based interpretation is possible only when a domain 
or universe of items has been created and the test is based on a 
sample from this domain" (p. 28). A domain, according to such think- 
ing, is conceived of as a list of potential items from which a sample is 
drawn in constructing a test. Roid and Haladyna (1982) attribute to 
Bormuth (1970) the idea that a technology of item writing might "be 
based on the transformation of sentences into questions" (p. 99). A 
domain, by this view, is a list of sentences. They acknowledge that 
the whole idea of sampling from a domain of sentences is susceptible 
to "serious objections" that arise in connection with "the meaningful- 
ness of definable universes" (p. 34). 




There are really two problems here: modern linguistic theory 
shows that the number of sentences in any given domain of interest 
for practical purposes is non-finite, and it also shows that any known 
method of algorithmically generating sentences will produce a great 
deal of nonsense. Roid and Haladyna (1982), without apparently un- 
derstanding the linguistic necessities, say "there is a chance for end- 
less mapping sentences, facts, and facet elements, with lack of agree- 
ment among developers being a major detriment to progress" (p. 
132). The non-finiteness of sentences about any given subject matter 
renders the idea of a "randomly selected representative sample" 
uninterpretable, and the abundance of nonsense that would be gen- 
erated by any known algorithmic procedure makes that approach 
relatively unappealing. Further, the recommendation (of Bormuth, 
1970, cf. Roid and Haladyna, 1982, p. 92) that all possible items in a 
domain be specified is logically (in principle) unattainable. For these 
and other reasons, I still believe (cf. Oller, 1979, pp. 32-33) we need
to look for an approach to educational and psychological testing that 
assesses the relative efficiency of a generative system (i.e., the sym- 
bolic system itself) rather than attempting to representatively 
sample from an unattainable listing of an infinitude of demonstrably 
infinite universes of particular sentences or test items. When the fo- 
cus is shifted from a list of items (a poor characterization in any case 
of any non-finite domain of sentences) to the generative basis which 
underlies the representations that constitute that domain, we have 
some hope of achieving both reliability and validity. While ap- 
proaches to educational and psychological measurement have yet to 
appreciate the purely theoretical implications of the Chomskyan 
revolution, happily a movement toward more pragmatic, holistic
testing is nonetheless discernible. 
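
The non-finiteness argument above can be made concrete with a toy example. The sketch below is only an illustration under stated assumptions: the two-rule grammar, the word lists, and the function names are hypothetical and are not drawn from Roid and Haladyna, Bormuth, or any other source cited here. It shows that a single recursive rule is enough to make the count of sentences about one small subject matter grow without bound as deeper embedding is allowed, so that no finite list of items can exhaust the domain.

```python
from itertools import product

# Hypothetical toy vocabulary, chosen only for illustration.
NOUN_PHRASES = ["the teacher", "the student"]
VERBS = ["tests", "teaches"]


def sentences(depth):
    """Yield every sentence whose embedding depth is at most `depth`."""
    # Base rule: S -> NP V NP
    for subj, verb, obj in product(NOUN_PHRASES, VERBS, NOUN_PHRASES):
        yield f"{subj} {verb} {obj}"
    # Recursive rule: S -> NP "believes that" S
    if depth > 0:
        for subj in NOUN_PHRASES:
            for inner in sentences(depth - 1):
                yield f"{subj} believes that {inner}"


def count_sentences(depth):
    """Count the sentences the toy grammar generates up to `depth`."""
    return sum(1 for _ in sentences(depth))


if __name__ == "__main__":
    # Each extra level of embedding adds new sentences; the list never closes.
    for depth in range(4):
        print(depth, count_sentences(depth))
```

With these two rules the count at each depth is the eight base sentences plus twice the count at the previous depth, so it never stops growing. The generative rules, by contrast, stay fixed at two, which is the sense in which a generative system rather than a list of items is the finite thing available to be assessed.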

Whereas Roid and Haladyna (1982) view individual test items as 
the "basic building blocks of tests" (p. ix), they implicitly take into ac- 
count the contrast between (1) discrete-point theory where individual 
items are matched with some abstract trait and (2) a more pragmatic
approach where the tester/teacher thinks in terms of "a theory of
the relations between a test and other variables in the real world (a 
nomological network)" (p. 8). The latter approach would seem to ad- 
dress the fundamental problem of pragmatic mapping (also known as 
abductive reasoning) to which we return in part 3 below. It is also 
refreshing to read in Roid and Haladyna (1982) that "testing is 
viewed as a part of instruction and not a separate operation" (p. 30). 
In this they follow the lead of people like Eva L. Baker (1980) who 
argues for a comprehensive "integrating" model of "teaching-learn- 
ing-assessment" (p. 14) where the various activities are merely 
viewed from different perspectives, but not as distinct and separate 
entities apart from the whole context of education. It is the articulation
of a theoretical basis for such holistic, nomological, or pragmatic
approaches, the author will argue in section 3, that is most needed.




The author agrees with Gardner (1990) who cites Chomsky 
(1975) in support of the idea that the acquisition of various
representational abilities - though not always the more abstract academic
ones that Gardner calls "literacy, numeracy, and critical thinking" -
is natural and normally proceeds without a hitch. "Given environ- 
ments that are not grossly impoverished, all children will learn how 
to speak and understand their native languages (and other lan- 
guages in their surround) with ease and facility; acquire basic under- 
standings of the operation of the physical world (the constancy of 
matter, the principles of cause and effect); understand key aspects of 
the social world (the way to convince another individual, the detec- 
tion of benevolent or malevolent motivation); and use a range of sym- 
bolic codes, such as those involved in picturing, gesturing, and mak- 
ing music, in order to express and derive meanings" (pp. 89-90). Fol- 
lowing Chomsky, Gardner acknowledges that not only do children 
normally accomplish such things without special tutelage, but that 
"adults do not know how to teach [his italics] many of the most im- 
portant forms of knowledge which every normal child acquires" (p. 



Gardner in all of his recent writings stresses the partial indepen- 
dence of "intelligences." He says, "While such areas as reading, or 
studying history, or composing music may well be characterized by 
stages of competence, the stages found in one domain may have little 
resemblance to, or correlation with, those regnant in other domains... 
even in those areas of learning which appear to be universal, all 
forms of learning do not develop in synchrony. Rather, human beings 
differ in the manner in which, and the speed with which, they ex- 
press various mental capacities or 'intelligences' " (pp. 90-91). He 
points out that learners often exhibit what may be called "U-shaped" 
growth or learning curves. They seem to acquire a concept but fail to 
generalize it appropriately to new contexts or over-generalize it to 
contexts where it does not work. He argues that what is missing in 
such cases is what he calls "connecting tissue" that would relate ab- 
stract symbolic representations to the world of experience more ar- 
ticulately and more completely. In my terms, what is missing is the 
sort of pragmatic mapping that all genuine learning requires. Too
much discrete-point, surface-oriented material passes for curriculum
and yet does not achieve much effect. Students remain without
the pragmatic linkages to their experience that would make sense of 
such materials. 

Gardner (1990) says that "so long as testing is geared exclusively
to 'school knowledge' " — i.e., the surface-oriented, discrete-point, 
unintegrated variety - the "credentials provided by the school may 
bear little relevance to the demands made by the outside community" 
(p. 93). To remedy the situation, he is concentrating his efforts on de- 
veloping "new forms of assessment which are sensitive to particular 
intelligences and which can document the kinds of learning that take
place 'in context' in which students carry out projects of some scope" 
(p. 104; also see Gardner, 1989; and Gardner and Hatch, 1989). He 
says that "finding the topic or skill with which one feels 'connected' is
the single most important educational event in a student's life" (p. 
104; also Gardner and Walters, 1986a). 

In coming to his eventual list of seven basic intelligences, 
Gardner and colleagues examined several sources in the literature: 
(1) normals and (2) pathological and special populations including such
cases as autism, savantism, and learning disabilities. Gardner and 
Hatch (1989) claim that it is possible to escape the biased confines of 
"linguistic and logical skills" by developing what they call "intelli- 
gence fair measures" that "seek to respect the different modes of 
thinking and performance that distinguish each intelligence. Al- 
though spatial problems can be approached to some degree through 
linguistic media (like verbal directions or word problems), intelli- 
gence-fair methods place a premium on the abilities to perceive and 
manipulate visual-spatial information in a direct manner. For ex- 
ample, the spatial intelligence of children can be assessed through a 
mechanical activity in which they are asked to take apart and 
reassemble a meat grinder.... Although linguistically inclined chil- 
dren may produce a running report about the actions they are tak- 
ing, little verbal skill is necessary (or helpful) for successful perfor- 
mance on such a task" (p. 6). Here Gardner and colleagues seem un- 
aware of relevant research by A.R. Luria (1959, 1961, 1979; also 
Luria and Yudovich, 1959). Luria showed that the integration of ver- 
bal skills with certain motor tasks was essential to successful perfor- 
mance of those tasks for children at an early stage of development 
(e.g., being able to push a button consistently when a green light was 
on but not when a red light was on). 

Serendipitously, in keeping with caveats of pragmatic testing, 
however, Gardner and colleagues (e.g., Gardner and Hatch, 1989) 
recommend holistic, highly pragmatic assessment procedures: "even 
at the preschool level, language capacity is not assessed in terms of 
vocabulary, definitions, or similarities, but rather as manifest in 
story telling (the novelist) and reporting (the journalist). Instead of 
attempting to assess spatial skills in isolation, we observe children as 
they are drawing (the artist) or taking apart and putting together 
objects (the mechanic)" (p. 6). Their approach they admit "blurs the 
distinctions between curriculum and assessment" (p. 5) but this 
surely we must applaud. It falls in line with recommendations com- 
ing from a number of quarters these days for blurring not only the 
lines between teaching and testing but also between the school, 
home, and community (Simich-Dudgeon, 1987; and Quintero and 
Huerta-Macías, 1990).

Parent involvement is stressed by Quintero and Huerta-Macías
(1990): they say, "the positive impact of parents' involvement in their 
children's education is well documented (here they cite among others 
Simich-Dudgeon, 1987 and Wells, 1986)" (p. 307). They point out that 
"instructional activities must not only be interactive in nature, but 
also rich in cultural meanings, comparisons, and critical analysis for 
making classroom and out of classroom connections" (1990, p. 312). 
Or, as Freire and Macedo (1987) put it, "the command of reading and 
writing is achieved beginning with words and themes meaningful to 
the common experience of those becoming literate, and not with 
words and themes linked only to the experience of the educator" 
(Quintero and Huerta-Macías, 1990, p. 42). Or, from a different
angle, Smith (1989) says, "individuals become literate not from the 
formal instruction they receive, but from what they read and write 
about and who they read and write with" (p. 353). Quintero and 
Huerta-Macías argue for a "whole language approach" (citing among
others Bruner, 1984; Goodman, 1986; and Smith, 1984), which they define
as follows: "the whole language approach to language learning emphasizes
that language be taught naturally as it occurs within any social
environment instead of segmenting it into bits and pieces" (1990, p. 307).
They recommend an experience-based approach appealing to the rich 
existing experiences of the family (Auerbach, 1989). 

However, it is important to keep in mind, as Miller (1990)
stresses, that the broader and deeper view of literacy that
whole-language approaches advocate also suggests connections that have too
long been neglected: "Literacy viewed from the perspective of com- 
munication arising from shared activities with meaningful others 
cannot be separated from the issues of intelligence, learning, and 
language...literacy becomes entwined with how and what people
know - with intelligence" (p. 2). When this broader view is assumed, 
we may hope for better results in education. Quintero and
Huerta-Macías (1990) conclude: "In sum, because Project FIEL [Family
Initiative for English Literacy] stresses language use in meaningful
context, the student's needs, wishes, and past experiences naturally 
become the teaching methodology, and flexibility of the curriculum is 
a natural result. Program goals are reached by students, parents, 
and teachers working together through interaction and learning for 
real-life needs. Finally, the experience of the project indicates that 
when social context is attended to in a positive way and the dignity 
of the learner is upheld, learning occurs" (p. 312). 

By using context-rich materials and activities that engage chil- 
dren more fully and challenge their "intelligences" more specifically, 
Gardner and Hatch (1989) report higher motivation and evidence of 
a greater diversity of abilities. They report on a study in 1988-1989 
with 20 preschool children who were tested on "story telling, draw- 
ing, singing, music perception, creative movement, social analysis, 
hypothesis testing, assembly, calculation and counting, and number
notational logic" (p. 8). The authors conclude that only the activities
requiring "logical-mathematical intelligence" proved significantly
correlated with each other (r = .78, p < .01). Their analysis,
however, may be more detailed than the small number of preschool
subjects in their study would justify. In a follow-up with first graders, 15
in all, again the conclusions are perhaps too general to be sustained 
by the small number of observations involved, but some evidence is 
provided showing that children do differ in expected ways on the dif- 
ferent intelligences posited. 

Walters and Gardner (1985) say that "each intelligence" (of the 
seven Gardner had previously identified) "must have an identifiable 
core operation or set of operations": for example "one core of Linguis- 
tic Intelligence is the sensitivity to phonological features" (p. 4). They 
say, "While it may well be possible for an Intelligence to proceed
without an accompanying symbol system, a primary characteristic of
human intelligence may well be its gravitation toward such an
embodiment" (p. 5). Of course, if we follow C. S. Peirce, we must suppose
that a sign system of some sort is prerequisite to any intelligence 
whatever. Here is where some additional theoretical development, I 
believe, is needed. 

Another trend in the general educational-psychology literature 
that corresponds to a move away from atomistic analytic approaches 
and toward more holistic pragmatic procedures can be seen in stud- 
ies of language disorders and learning disabilities. Audet and 
Hummel (1990), for instance, give an interestingly pragmatic analy- 
sis of the discourse of a nine-year-old boy diagnosed as language- 
learning disabled and behaviorally disordered. In general, they fol- 
lowed the discourse analysis procedures recommended by Damico 
(1980, 1985a, 1985b, and 1991). Although Adams and Bishop (1990)
and Bishop and Adams (1990) did a less fine-grained analysis (see
their comparison of their own with Damico's approach on p. 260), like 
Damico (1985b) they were also able to show substantial reliability for 
judgments of pragmatic appropriateness. The shared point in all 
these cases, however, was to give greater attention to pragmatic as- 
pects of discourse (an approach also advocated by Miller, 1990 and by 
Prutting and Kirchner, 1987). 

(3) Language Proficiency in Relation to 
a Theory of Intelligence 

The bulk of the research on intelligence measurement per se is 
only tangentially relevant to a theory of language proficiency in rela- 
tion to a comprehensive model of intellect. The IQ measurement re- 
search has been limited by its taxonomic character from the begin- 
ning and has scarcely begun to consider the full implications of the 
Chomskyan revolution. The fact is that psychology and psychomet- 
rics are yet to feel the force of generative theory. Taxonomic models, 
e.g., Guilford's "theory of intellect" (1967) and Bloom's taxonomy
(1976; also Bloom and Krathwohl, 1977), are not merely out of date,
they are either incorrect in fundamental ways, or else the
generative conception of grammar is entirely misguided. At any rate, the
taxonomies, when compared against generative theories, cannot com- 
pete in scope or power. They are logically too impoverished to even 
begin to account for the facts of human language ability, not to mention
other semiotic capacities.

On the other hand, the generative conception of grammar was 
implicit in much work before the Chomskyan era. Such a conception 
was apparent in Saussure's advocacy of a general theory of 
"semiology." Before that, C. S. Peirce [1839-1914], a scientist
characterized by Ernest Nagel in 1959 as "the most original,
comprehensive, and versatile philosophical mind this country has yet produced,"
had written the equivalent of 104 volumes of 500 pages each in
octavo, focussed primarily on the theory of semiotics. Peirce, more than
any other scholar, worked toward a general theory of
representations. The essence of Peirce's conception of the relation between
language and intellect is suggested by Albert Einstein (1941):

Everything depends on the degree to which words and word-combinations
correspond to the world of impression.

What is it that brings about such an intimate connection between 
language and thinking? Is there no thinking without the use of 
language, namely in concepts and concept-combinations for 
which words need not necessarily come to mind? Has not every- 
one of us struggled for words although the connection between 
"things" was already clear? 

We might be inclined to attribute to the act of thinking complete 
independence from language if the individual formed or were 
able to form his concepts without the verbal guidance of his envi- 
ronment. Yet most likely the mental shape of an individual grow- 
ing up under such conditions would be very poor. Thus we may 
conclude that the mental development of the individual and his 
way of forming concepts depend to a high degree upon language 
(1941, in Oller 1989, p. 62).

Peirce and Saussure, presumably for similar reasons, agreed in
this assessment. Both of them contended that language is the
canonical semiotic medium and that by the systematic study of it we should
be able to optimize our understanding of representational
("semeiotic," Peirce's term, or "semiological," Saussure's term)
processes in general. More recently Noam Chomsky has urged the same
program. He wrote in 1972: "One would expect that human language 
should directly reflect the characteristics of human intellectual ca- 
pacities" (p. ix). 

Figure 3. Pragmatic Mapping of Representations onto the Facts of Experience via Abductive Reasoning

Figures 3-7 elaborate on this central theme. Figure 3 pictures the
primary representational problem as outlined in the above remarks
by Einstein, and more fully by Peirce in the Nineteenth Century. On
the left hand side of the diagram the raw uninterpreted facts of expe- 
rience are pictured; on the right hand side, representations of them. 
The question for a theory of intellect is how the connection between 
the two realms is accomplished. This in a nutshell is the pragmatic 
mapping problem, or in Peirce's words it is the problem of abductive
reasoning. It is construed, in the theory under consideration, to be 
the primary problem of intelligence. 

Einstein described this problem and defined the "gulf" as shown
in the following lines: 

...the concepts which arise in our thought and in our linguistic 
expressions are all - when viewed logically - the free creations 
of thought which cannot inductively be gained from sense experi- 
ences. This is not so easily noticed only because we have the 
habit of combining certain concepts and conceptual relations 
(propositions) so definitely with certain sense experiences that 
we do not become conscious of the gulf - logically unbridgeable - 
which separates the world of sensory experiences from the world 
of concepts and propositions (1944, in Oller 1989, p. 25).

Readers familiar with Chomsky's work will not fail to see the 
profound similarity between what Einstein says here and what 
Chomsky has said many times elsewhere. The idea that true
representations are validly connected with whatever they purport to
represent, otherwise known as the correspondence theory of truth, is
foundational to what Einstein is saying in the immediately preceding 
quotation. Moreover, it is implicit in many of the remarks of
educators concerning the need to relate what is talked about in the
classroom to the actual, real-life, real-world experience of students both in
and out of the classroom. 

Probably the main reason that the Peircean or Einsteinian view 
of reality has not been more widely accepted by scholars is a
peculiar skepticism about our knowledge of the external world that 
still prevails in much modern thinking and education. MacNamara 
(1989) shows that modern approaches to human representations of- 
ten assume an extreme variety of such skepticism. In reviewing a 
collection of works representing some of the most widely read theore- 
ticians of the present decade (Umberto Eco, Roger Schank, Ray 
Jackendoff, George Lakoff, and others), MacNamara (1989)
complains that "the collection radiates skepticism about the capacity of
the mind to know reality" (p. 350). While some of the authors see 
mental models as mediating between representations and the exter- 
nal world, others see them as being only in contact with themselves. 
Now it follows that if mental representations have only themselves 
or other mental representations as their ultimate objects, thinking is 
quite independent of any external reality, and must be regarded as 
essentially unrelated to our actions. Common sense and all logic
reject this extreme view. On the contrary, we suppose that people are
responsible for their actions in a way that inert objects and unrea- 
soning organisms are not and that the responsibility is based in the 
linking of representations with corresponding facts that have an in- 
dependent reality of their own. 

When a representation corresponds faithfully to a fact we say 
that the representation is true of that fact. This is the layman's defi- 
nition of truth and it does not differ in any essential respect from 
that of the scientist. However, some skeptics suggest that the very 
correspondence of a representation with a factual state of affairs is 
itself a fiction. For instance, Umberto Eco capsulizes this view in his 
chapter title, "On truth, a fiction" (in Eco, Santambrogio, and Viola, 
1988). While C. S. Peirce, whom Eco claims to follow, saw truth as a
purely abstract quality of representations (which would give it the
same immaterial quality as any fiction - thus making it fictional),
Peirce did not assign any extra degree of reality to material entities
so the abstractness of truth would not detract in the least from its 
reality. On the contrary, while physical things, owing to the laws of 
thermodynamics come into existence in space and time, grow old, 
wear out, and are no more, the truth of any representation (e.g., that 
these words were written by yours truly in Albuquerque, New 
Mexico, at about 2:25 in the afternoon on August 4, 1991) is an eternal
fact. It does not change over time. Therefore, for Peirce, truth
was not a fiction, though it has the same abstract quality as a fiction. 
The difference between these views is like that between a libertarian 
skepticism on the one hand, and a responsible pragmatism (or what 
Peirce called "pragmaticism" to distinguish his views from those of
William James and John Dewey) on the other. 

I have mentioned skepticism because it is probably the prevailing
view among theoreticians of the twentieth century in spite of the fact 
that the typical school teacher takes a more realistic approach. For 
instance, when educators and parents speak of relating classroom 
activities to the real world, they presuppose that a real world exists 
and that we have some more or less valid knowledge of it. Therefore, 
if whole-language, experience-based, socially relevant curricula are 
actually possible, the extreme variety of skepticism must be wrong. 

Figure 4 elaborates on the model by proposing a hierarchy of 
three distinct kinds of representational capacities: linguistic, kinesic, 
and sensory-motor. According to Peirce, the language capacity is
fully abstract and may be used to represent any imaginable, or even 
unimaginable idea whatever. We may at least speak of the 
unimaginably fantastic. The kinesic, gestural, sort of representation 
is intermediate. It is conventional and arbitrary to some extent, but 
may also involve iconic (analogical) elements. For instance, a
brandished fist suggests more or less iconically the act of punching
someone, but it may by convention acquire a rather different meaning -
e.g., it may be a sign of solidarity or brotherhood. 

Or consider the fact that Americans and most western Europeans 
indicate themselves kinesically by pointing roughly at their own ster- 
num (the center of the chest) with the right index finger or thumb of 
the right hand. Japanese, however, point to themselves by touching 
or pointing toward their nose with the right index finger, palm 
turned inward toward the body. Each of these gestures has its con- 
ventional aspects as well as its universal basis in the ego-reference 
point. The latter is not a mere convention since it is physiologically 
impossible for a perceiver to have any other primary reference point. 
(Without the notion of one's own self, it would be impossible to credit 
any other self with existence or to differentiate the self from any 
other person; see Peirce, in Moore, et al., 1984, pp. 201ff.)

Sensory-motor representations on the other hand are more or 
less directly, and iconically, related to the facts of experience. Per- 
sons skiing down a mountain not only represent the terrain ahead in 
a continuous flow of images but must also represent at some level 
body postures and internal commands for motor adjustments in order 
to control body and skis to accommodate the slope beneath them. 

Linguistic representations by contrast achieve a higher level of
abstraction and a closer approximation to validity. It is true that 
they must involve icons and indexes to the extent that they are syn- 
thetic in character, i.e., to the extent that they inform us about ac- 
tual experience, but their fundamental character pertains to their 
abstractness and near independence of anything external to them. 
While linguistic forms that depend on sensory-motor representations 
of non-linguistic states of affairs (e.g., factual or fictional contexts), or 
that appeal to indexical or deictic relations (e.g., pointing or naming 
or referring) involve the same kinds of degeneracy associated with 
icons and indexes respectively, the purely semantic values associated 
with words and propositions are quite impervious to either of those 
sorts of degeneracy. For instance, our concept of mortality does not
deteriorate from one moment to the next in the way that our recollec- 
tion of a scene does. That is, the semantic value of a word or proposi- 
tion is not qualitatively degenerate. Nor does our idea of mortality 
depend on any particular instance of it that might be singled out for 
attention (e.g., the fact that Socrates died). In fact our abstract
concepts (or the abstract meanings of words, propositions, and
texts/discourses) are not at all reactionally degenerate in the way indexes are.
Therefore, Peirce argued, symbols are relatively genuine, i.e., pure
and valid by comparison to icons and indexes. 

In addition to the fact that linguistic representations are prima- 
rily symbolic while gestures have an intrinsic indexical quality in 
many instances and sensory-motor representations are largely iconic, 
a few more words need to be said about the three main categories of 
semiotic systems. Because of their greater abstractness and symbolic 
character, linguistic representations and their underlying forms em- 
body certain cognitive powers of reasoning that the other two major 
classes of representations are not capable of achieving. For instance, 
there is no way that any iconic representation can express ad- 
equately the notion that human beings are mortal. Nor is it possible 
to express that idea strictly speaking in an index or any other sort of 
mere gesture. An abstract grammatical system capable of expressing 
a practical infinity of subject-predicate relations, negations, conjunc- 
tions of ideas, and the like is required to express fully what is meant 
by the fact that human beings are mortal or any other similarly com- 
plex abstract proposition. However, kinesic and sensory-motor repre- 
sentations also have certain special properties. For instance, an 
iconic representation, such as a visual representation of a scene, can- 
not be quite perfectly translated into words. The Chinese aphorism 
that a picture is worth a thousand words is an understatement. A 
picture is worth many more than a thousand words. Similarly, ges- 
tural systems have unique capabilities. Just as a picture is worth a 
thousand words, a single look, a facial expression or tone of voice 
may speak volumes. Affective information, the emotive side of human
experience, is, it seems, far more effectively conveyed in facial
expression and tone of voice than it ever could be in words or images alone.
Therefore, each of the three major semiotic systems has its own
special capabilities. Still, it must be said that language reigns supreme
as commanding the greatest degree of independence from the mate- 
rial world and also, by far, the greatest degree of generality relative 
to its scope. We cannot visualize, hear, smell, taste, or feel every- 
thing we can talk about, nor can we express in paralinguistic mecha- 
nisms every idea we can talk about. On the other hand, we can talk 
about absolutely anything that is conceivable. Anything beyond our 
capability to represent in some oblique manner in words is simply 
beyond our conception altogether. 

So much for the three general headings under the overall intel- 
lectual ability termed "General Semiotic Capacity" in Figure 4. It re- 
mains to explain the terms subordinate to each of these. Under "Lin- 
guistic Semiotic Capacity," an ability that is believed to be innate and 
species specific to human beings, come terms that correspond to the 
grammars of particular language systems, L1, L2, through Ln. These
systems, to the extent they are not already specified by innate
knowledge of universal grammar, must be acquired if they are to be known
at all. Each in its turn corresponds then to a class of textual
representations in experience, t_L1, t_L2, through t_Ln. These terms stand for
the texts, for instance, that conform to one's primary language, or
second language, and so forth. For monolinguals, there will be no L2.

The same sort of hierarchical arrangement is hypothesized under 
the "Kinesic Semiotic Capacity." It too is expected to be largely in- 
nate though not entirely species specific to human beings. Again, the 
universal kinesic capacity dominates (or branches into) a plurality 
(or at least a potential plurality) of subordinate acquired systems. 
Each of these subordinate systems dominates a class of texts or rep- 
resentational forms in experience, and these tend to be loosely tied to 
linguistic texts. For example, English speakers are apt to accompany 
the statement that a certain person is about "so tall" with a corre- 
sponding gesture, palm down, hand extended. A speaker of a differ- 
ent language may use a quite different conventional gesture for the 
same purpose. 

More importantly, research shows that the sequence of gestures 
is delicately coordinated with the sequence of linguistic forms and 
meanings. According to research by Condon and Ogston (1971) this is 
true not only of the speaker but also of the audience to such an ex- 
tent that their body movements appear to be under the control of one 
and the same puppeteer. 

The case for Sensory-Motor Capacity, if anything, is more dramatic.
There is no question that much of our ability to perceive the world
and our body as part of it must be innate (cf. T. G. R. Bower, 1971,
1974; also the Chomsky and Piaget debate in Piattelli-Palmarini, 1980
and comments from the other participants). However, every normal
person operates in ordinary experience by so many routines and
patterns that it would be impossible to estimate how many distinct
sensory-motor systems an ordinary individual possesses. There are
sensory-motor programs for almost every imaginable aspect of routine
experience: chewing gum, brushing your teeth, grooming in general,
dressing, tying your shoes, driving a car, riding a bicycle, playing
basketball, going to class, giving a talk, writing a letter, typing
one, talking on the phone, etc., and each of these routines is
divisible into subroutines of a great variety.

To the extent that such programs can be made explicit as
rule-governed systems, they are like grammars of natural languages.
They also have their own sensory-motor texts, tSM1, tSM2, and so
forth. For instance, our ability to recognize a game of basketball
and to distinguish it from a tennis match, or to distinguish either
of these from a boxing match, is dependent in part on our knowledge
of the corresponding sensory-motor systems. But none of these
knowledge systems is the same as an actual game of basketball, or
tennis, or a particular boxing match. Yet, the general rule-systems
underlying the particular manifest forms (tSM's in Figure 4) are at
least as distinct from each other as are the diverse "textual"
manifestations. Sensory-motor texts, in their turn, are also
coordinated in ordinary experience in delicately articulate ways with
kinesic and linguistic texts.

Because the information processing approach to the development
of semiotic systems over time is discussed in Damico and Oller (1991)
along with a detailed analysis of some of the empirical evidence in
favor of the theory, I will merely summarize that evidence here
and will skip over much of the discussion given there (Damico and
Oller, 1991) of the theory from an information processing point of
view.

Empirical evidence in favor of the theory sketched out includes,
first, a plausible explanation of our ability to translate information
from one semiotic system into another. Each of the universal systems 
of knowledge (and no claim is made as to the completeness of the 
ones postulated, only their necessity) though distinct, is related to 
the others through the domination of the general capacity, and each 
also subordinates one or more particular systems that are acquired 
and are to some extent conventional in character. For example, the 
acquisition of the primary language at once fleshes out the universal 
aspects of language that are realized in that system and at the same 
time results in the addition of conventional features that are unique 
to the primary language. Much the same will be true in the acquisi- 
tion of the kinesic system that accompanies the first language. Our 
ability to translate information from one system more or less ad- 
equately into another is indicative of the underlying general capacity 
that connects the different quasi-independent modules or, in
Gardner's terms, "multiple intelligences." We can talk about what we
see or describe in words the meaning of a gesture, facial expression,
or tone of voice. Or, we can visualize a scene as someone else de- 
scribes it, imagine a facial expression, tone of voice, or the like based 
on a linguistic representation. Paraphrase is included as a special 
case of such translations. We can also paraphrase meanings that 
have been expressed in a certain surface form by putting them into 
other surface forms that give more or less the same result. For in- 
stance, the statement that "Men are mortal" may be paraphrased by 
saying that "All humanity must ultimately face death" or that "Mor- 
tality is a trait of human beings," etc. Translation across distinct lan- 
guage systems, e.g., "Los hombres son mortales" or "La mortalidad es 
una de las cualidades de los hombres," or translation into any lan- 
guage or other form that can be imagined, is ample evidence in favor 
of a general factor of semiotic capacity. Apart from such a general 
capacity, such translations (even quite imperfect ones, much less 
fully satisfactory ones) would be inexplicable. 

I agree with Roid and Haladyna (1982) as well as Anderson 
(1972) who recommend the use of paraphrase in the testing of com- 
prehension of prose materials in a school curriculum. Roid and 
Haladyna (1982) say that "the reason for using paraphrase [in test- 
ing] is to ensure that students have truly comprehended the ideas... 
that they have not just recalled the wording at a surface level" (p. 
91). They quote Anderson (1972): "to answer a question based on a 
paraphrase, a person has to have comprehended the original sen- 
tence, since a paraphrase is related to the original sentence with re- 
spect to meaning but unrelated with respect to the shape or sound of 
the words" (p. 92). My point, however, is a little different than theirs 
as I am stressing the fact that all comprehension of a semiotic sort 
involves a sort of paraphrasing or translation into a different 
semiotic medium. This idea comes from Peirce and was viewed by
Roman Jakobson (1980) as the special genius of the whole Peircean 
perspective on semiotics and linguistics. Jakobson commented that 
"the translation of a sign into another system of signs" as a definition 
of the process of interpretation was "one of the most felicitous, bril- 
liant ideas which general linguistics and semiotics gained from the 
American thinker" (p. 35). 

Now here is where the theory of Walters and Gardner runs into a 
difficulty: if there were really independent "intelligences," it should 
not be possible to translate very well from one to another. They, of 
course, admit that it is possible to do some such translation and yet 
at the same time see this as a bit of a "conundrum." They give an ex- 
ample of a non-mathematically inclined child who must master some 
mathematical principle. They say, after the mathematical approach 
fails, "the teacher must attempt to find an alternative route to the 
mathematical context — a metaphor in another medium. Language is 
perhaps the most obvious alternative, but spatial modeling and even 
a bodily-kinesthetic metaphor may prove appropriate in some cases. 
In this way, the student is given a secondary route to the solution...
perhaps through a medium that is relatively strong for that indi- 
vidual" (p. 20). What this potential detour to the difficult mathemati- 
cal principle shows is that it must be possible to some degree to 
translate between the different symbolic media. However, they sur- 
mise that "there is no necessary reason why a problem in one domain 
must be translatable into a metaphorical problem in another do- 
main... as learning becomes more complex, the likelihood of a suc- 
cessful translation diminishes" (p. 20). They assert, "the mathemati- 
cal principle cannot be translated entirely into words (which is a lin- 
guistic medium) or spatial models (a spatial medium)" (p. 19). How- 
ever, no proof of this has been offered, and Peircean theory shows 
that one of the properties of truly symbolic systems is their relatively 
perfect intertranslatability. While we cannot translate from an icon 
to an index, nor vice versa, nor can we always translate from a sym- 
bol to either an icon or an index, we can always translate from one 
symbol to another, and there is no limit to the accuracy of such sym- 
bolic translations. Furthermore, all indexes and icons are more or 
less translatable into symbols, though the reverse is sometimes im- 
possible. How, for instance, would you adequately represent the mor- 
tality of human beings by pointing to something in particular? Or 
what icon would show the full meaning of the symbolic proposition 
that humans are mortal? On the other hand, a verbal description 
may suggest an icon just as it may suggest a particular index. In 
fact, verbal descriptions can literally include icons and indexes 
within them so as to more or less completely usurp their special rep- 
resentational capacities. 

The fact that fairly complex translations are meaningful is dem- 
onstrated in the sort of research exemplified by Nolen and Haladyna 
(1990). They focussed on two types of study strategies that encourage 
"deep-processing" (their term): elaboration (e.g., "figure out how it 
fits in with what you learned in class") and monitoring ("asking your- 
self questions while you read to make sure you understand") (p. 117). 
They argue that "if students think the teacher wants them to under- 
stand material and relate it to their own lives, as well as to think cre- 
atively and independently about it, they will come to value strategies 
(like monitoring and elaboration) that lead to those goals" (p. 119). 
Now if translation of the sort that takes place between distinct 
semiotic media were not fairly good, it is difficult to see how "deep- 
processing" would relate to all of the diversity of concepts, illustra- 
tions, photographs, texts, experiments, etc. that constitute the cur- 
ricular bases for learning about science. In fact, the whole thesis of 
experience-based, socially relevant, whole language education, is 
grounded in the implicit assumption that meaningful connections 
and translations across distinct semiotic media are not only possible 
but more normal than the traditional analytic separation of those 
media into separate and independent categories. 

Further evidence of the connectedness of the various disciplines
summed up in Gardner's terms "literacy", "numeracy", and "critical 
thinking" (Gardner, 1990) is seen in a rare longitudinal study by 
Benbow and Arjmand (1990) involving 1,247 persons initially identi- 
fied in the seventh or eighth grade as "mathematically precocious". 
These individuals were observed again after they completed college 
to identify factors that contribute to high achievement in mathemat- 
ics and the sciences. In addition to finding that a high SAT score at 
age 12 was a good predictor of subsequent performance (however, a 
mediocre or low score did not yield much predictive value), the au- 
thors (Benbow and Arjmand) confirmed the observation of Walters 
and Gardner (1986a) that there was typically some "crystallizing
experience" (event or persons) that contributed to the educational
development of the high achievers (p. 437). Two observations are
suggested here: first, that testers cannot rely on negative evidence
as much as positive evidence of abilities, and second, that
interpersonal relations (a mentor or encourager) may have a profound
influence on mathematical or scientific achievement. Now this last
outcome would seem to be excessively unlikely if the separate
"intelligences" labelled "interpersonal" and "logical-mathematical"
were truly quite independent. They have to be related via some form
of intertranslatability.

The semiotic model under consideration (Figures 3-5) also en- 
ables us to make certain distinctions that are, it would seem, critical 
to any theory of intellect that aims for explanatory adequacy (cf. 
Chomsky, 1965). For instance, we may distinguish innate from ac- 
quired knowledge. Innate knowledge is that which is present before 
any experience occurs, or which is triggered by experience and ma- 
tures more or less automatically and somewhat independently of ex- 
perience. Even sensory-motor systems have their noteworthy conven- 
tional aspects. For instance, to take a trivial but suitable case for the 
sake of illustration, in one culture it is customary for automobiles to 
drive on the right hand side of a roadway while in another motorists 
stay to the left. If it is hypothesized that conventional aspects of the 
various semiotic systems in question must be acquired, this sort of 
acquired knowledge will be distinguished from innate knowledge to 
the extent that the former is a product of experience involving the 
senses. It is suggested that information from the sensory-motor sys- 
tem passes to consciousness where the sensory-motor texts (i.e., se- 
quences of sensory-motor images) are interpreted. As they are under- 
stood, and just to that extent, they are passed through various stages 
of memory more or less distant from consciousness. The depth of the 
comprehension in question will determine the degree of impact on 
semiotic systems. It is hypothesized that the acquisition of grammar 
is a process of comprehending a particular kind of text so as to
develop the sort of intuitive feel which constitutes knowledge of a
language. By this reckoning, the acquisition of a particular grammar
is a process of comprehending texts in that language at a sufficient
depth so as to acquire the conventional aspects of the grammatical
system. 

Contrary to a lot of recent speculation about non-primary lan- 
guage acquisition (e.g., Gregg, 1988), the theory under consideration 
hypothesizes that non-primary language acquisition will proceed in a 
manner much like primary language acquisition except for the fact 
that acquisition of a second language will benefit greatly from (and
suffer minor interference from) the prior acquisition of the first language
(Asher, 1969; Asher and Price, 1967; Asher and Garcia, 1969). Simi- 
larly, the acquisition of a third language will benefit (mainly, and 
suffer but little) from the first and second, and so on. The fact that 
non-primary language acquisition usually falls short of the mark 
achieved in primary language acquisition (Gregg, 1988), it is sup- 
posed, should be explained not by positing a radical difference in the 
physiology (Scovel, 1988) or even the internal strategies of the person 
involved in one or the other task (Selinker, 1972), but by noting the 
radical differences across the two cases in access to target language 
texts and the relative motivations to comprehend and produce them 
(Brown, 1973; Schumann, 1975; Vigil and Oller, 1977).

In the primary language situation, the person doing the acquisi- 
tion is under incredible community pressure to conform to the norms 
of the primary language. A child who persists in non-conformities
will be ostracized or punished in ways that border on cruelty while
the one who succeeds in overcoming them will be rewarded by all the
privileges of membership in a community. For anyone other than a
child acquiring a non-primary language, no similar pressures or
rewards are likely to be experienced (cf. Brown, 1973; Schumann,
1975; Vigil and Oller, 1976; etc.). Exceptional cases, where
non-primary language acquisition succeeds in fairly dramatic ways,
are precisely those cases where access to target language texts and
susceptibility to pressures and rewards are both provided for. For
instance,
the person who marries across language boundaries and then moves 
to the country where the non-primary language predominates is far 
more apt to achieve native-like ability in the non-primary language 
than someone who merely takes a college course in that language. In 
fact, we are inclined to suppose, along the lines of Vigil and Oller
(1976) that continuing progress toward native competence in any 
language is much more a function of internally defined motives and 
sensitivities than it is a function of methods of teaching or modes of 
exposure. Clearly access to pragmatically rich and meaningful texts 
in the target language is requisite, but insufficient by itself. Motiva- 
tion to conform to the communal conventions of the target language 
system is also required. 

The hierarchical model under consideration not only supports the 
kinds of theoretical distinctions that are required in practice, e.g., 
the distinction between innate and acquired knowledge, consciousness
and memory, memory and grammatical knowledge, grammar
and text, text and comprehension, comprehension and production, 
primary and non-primary language acquisition, etc., but it also sug- 
gests some fairly explicit hypotheses about relationships within the 
proposed hierarchy that are eminently susceptible to empirical
testing. 

Since linguistic representations are the most abstract ones con- 
sidered in the model, it follows that the primary language is the most 
likely basis for the development of general semiotic capacity. Here I
differ somewhat from Walters and Gardner (1985, 1986a, 1986b). They
seem to view "logical-mathematical intelligence" as distinct from
"linguistic intelligence." But it has often been observed that logic
and mathematics involve kinds of reasoning that are parasitic and
derivative, being entirely dependent upon language (Peirce, in
Hartshorne and Weiss, 1931-1935; Lotz, 1951; Church, 1951; Russell, 
1919). Einstein alluded to the closeness of the relationship between 
language development and cognitive growth in general in the re- 
marks quoted above. It was a point developed further by Vygotsky 
(1934, 1978), Piaget (1947), Luria and Yudovich (1959) and Luria 
(1961). 

Further evidence may be seen in the remarkable accomplish- 
ments of deaf children with hearing parents. In cases where the chil- 
dren, for whatever reasons, are deprived of access to visual sign lan- 
guage they face a language acquisition problem far more difficult 
than that of the hearing child. Such children, it seems, face special 
cognitive difficulties that only the acquisition of a fully developed 
language system will enable them to overcome. Typically this is ac- 
complished through a natural visual-manual sign system such as 
American Sign Language (cf. Lane, 1984; Wilcox, 1988). (An interest- 
ing aside concerning such signed systems is that the primary role of 
language is assumed by gestures of the hands and body while the 
paralinguistic role of kinesics is taken over by speech and voice 
mechanisms.) Deaf children deprived of a manual/visual sign system
and forced to acquire speech directly are placed at a serious disad- 
vantage (Lane, 1988). The difficulties they face in cognitive develop- 
ment across the board are predicted by the hierarchical model under 
consideration. It follows that if children are deprived of a full and rich
primary language system that is accessible to them in terms of their 
sensory-motor system, they will suffer consequences of this lack 
throughout the cognitive hierarchy and especially in areas that de- 
pend on communication, e.g., social development. 

Moreover, children who acquire some ASL and are then taught 
Signed English (SE), an artificial system invented by hearing per- 
sons to correspond to English lexicon, syntax, and so forth, are ap- 
parently in the position of persons trying to acquire a second lan- 
guage system. In this instance, however, the system is artificial in a 
variety of ways. For instance, in theory SE gives equal emphasis to
stressed and unstressed morphological and lexical elements. In this
respect, and others, it is somewhat like Morse Code or even Pig
Latin. Unlike ASL, SE is a largely dependent system. Therefore,
when deaf children de-emphasize or omit redundancies of English 
structure, e.g., the "-ing" of present progressives and the like, they 
are making natural modifications in surface forms of signed texts 
that would conform to more normal expectations about universal 
grammar. 

Another hypothesis that is suggested by the theory under consid- 
eration is that neighboring elements of the hierarchy are more apt to 
influence each other than distant ones. For example, the primary 
language would have greater impact on second language acquisition 
than on third. The second similarly would be expected to influence 
the third, even more than the first language would, and so on. Again,
the experience of polyglots bears this out. Typically, "padding" (a
term from Newmark, 1966, i.e., the use of known language forms in
place of target language forms) comes from the most recently acquired
language rather than from any other.

Following out the same idea, transfer in general would be ex- 
pected to occur from the more developed systems to less developed 
ones. For example, the primary language would be expected to influ- 
ence a non-primary language rather than the reverse. The situation 
would be altered in favor of the non-primary language at just the 
point where the person in question achieved greater proficiency in 
the non-primary system. However, at just that point, the non-pri- 
mary system would be promoted to the status of the primary system 
and the former primary system would presumably be demoted to a 
secondary status. 

Another consequence of the postulated hierarchy is that distinct 
representational systems provide the means in some cases for com- 
prehending what would otherwise be incomprehensible. For in- 
stance, a discourse in a target language that might be entirely in- 
comprehensible if one had to rely on knowledge of that particular 
language alone can be made comprehensible if one has access to a 
translation provided in some other semiotic system. In normal lan- 
guage acquisition, e.g., primary language acquisition, as has often 
been pointed out (Macnamara, 1973, 1982) meanings of surface forms 
are often contextually obvious when those forms are being acquired 
(Krashen, 1985). The child first understands the context, e.g., by rep- 
resenting it in a comprehensible sensory-motor form, and subse- 
quently becomes able to understand the utterances associated with 
the context. In non-primary language acquisition, wherever it suc- 
ceeds, a similar scaffolding is often provided. It may be presented in 
some dramatization, in a film, or it may be presented through a 
translation, literally, into a language that the subject already knows.

By this line of reasoning, Krashen's input hypothesis (Krashen, 
1985) is vindicated (Oller, 1988). The input hypothesis in its most
basic form says simply that language acquisition progresses as the
acquirer comprehends texts that are a little beyond his or her cur- 
rent level of development in the target language. Spolsky (1985) and 
Gregg (1988) have contended that the input hypothesis is either false 
or trivially true. If it means we must understand what is beyond our 
understanding, it is false. If it means merely that we must compre- 
hend in order to learn, it is trivially true. However, the theory we are 
advocating here disposes of both of these interpretations. We do in- 
deed understand representations (target language texts) beyond our 
reach in one system (namely the target language) by appealing to 
representations in another semiotic system. The one provides an in- 
terpretation of the other. Therefore, because of the 
intertranslatability of semiotic representations, the input hypothesis 
remains viable. 

Cummins (1976) proposed the threshold hypothesis, an idea that 
relates to the impact of bilingualism, or more specifically adding a 
second language, on cognitive development. Subsequently (see 
Cummins, 1984, pp. 107-108) he modified his hypothesis and ex- 
tended it. The threshold hypothesis suggests that the child's starting 
level of proficiency in one or both languages may be an important 
mediating variable in avoiding a burden in becoming bilingual or in 
benefitting from bilingualism once achieved. There are actually two 
thresholds being proposed. 

On the low end, it is claimed that a child may have to achieve a 
certain minimal level of proficiency in one or both languages in order 
to avoid deficits. In other words, if the child falls below threshold in 
both languages, presumably it will be difficult or even impossible for 
that child to benefit from instruction in either language. Further, it 
follows that a child who has not acquired threshold level in the pri- 
mary language will only receive an unnecessary additional burden 
by being instructed in a second language. Therefore, the lower 
threshold is presumably important in the determination of when in- 
struction might be beneficially introduced in a non-primary lan- 
guage. 

At the other end of the scale, a high threshold is also posited. In 
order for a bilingual child to experience the expected benefits of bilin- 
gualism, e.g., greater ability to appreciate and utilize symbols and 
greater "metalinguistic awareness," i.e., ability to appreciate the ar- 
bitrariness and conventionality of linguistic symbols, the child must 
have surpassed the high threshold presumably in one or both lan- 
guages. 

Admittedly, the idea of one or more thresholds is loosely stated, 
but the research seems to support it (Cummins and Mulcahy, 1978; 
Duncan and DeAvila, 1979; Hakuta and Diaz, 1984; Kessler and
Quinn, 1980). In fact, as Hakuta (1986; also see Lambert, 1975) has 
shown, there is a long history of debate concerning the deleterious 
versus beneficial effects of bilingualism. Formerly, especially in the 
U.S., there was a widespread prejudice against "bilingualism" based
on research showing that minority language children got low scores 
on IQ tests. It scarcely occurred to the persons interpreting the re- 
search that the IQ tests were mainly measures of English language 
proficiency — something that the minorities in question had not yet 
had the opportunity to acquire. 

The main point here, however, is that the hierarchical model un- 
der consideration explains the available evidence concerning the 
threshold hypothesis and provides a convenient framework within 
which to understand the interrelationships of semiotic systems in 
general. Within a hierarchical model, the threshold hypothesis can 
be incorporated and elaborated in terms of transfer and interference 
and in terms of a more explicit theory of the role of language
proficiency in relation to cognition in general. Bilingualism and indeed
multilingualism deserve special consideration since they are bound 
to play a central role in the education of minorities. Moreover, the 
elaboration suggested by the theory under consideration is compat- 
ible, it seems, with the course that Cummins (1979, 1983a, 1983b) 
has begun to develop in terms of the CALP/BICS distinction. 

In response to consideration of the possibility of a general lan- 
guage proficiency factor, Cummins (1979) hypothesized a distinction 
between what he called cognitive academic language proficiency 
(CALP) and basic interpersonal communicative skills (BICS). This 
idea was appealing inasmuch as almost any educator who has dealt
with bilingual or multilingual contexts has observed ample evidence 
in its favor. A child that gets along satisfactorily on the playground, 
where cognitive demands are presumably lessened by the immediacy 
of physical and social context, may encounter difficulty in the class- 
room when it comes to reading, writing, solving word and math prob- 
lems, and in general interacting on a more abstract level. The child 
may have adequate BICS without sufficient CALP. This distinction is 
reminiscent of the sort of thing Gardner (1990) says in reference to 
representational systems that seem to be naturally acquired versus 
ones that need special "tutelage" — especially, "literacy, numeracy, 
and critical thinking" - the sorts of things that Cummins would 
group under CALP. Cummins (1983c), however, unlike Gardner and 
colleagues, clarified that he did not intend to argue that the two 
kinds of ability were unrelated, but rather that they were apt to ap- 
pear as such at the surface. To illustrate he adapted an "iceberg" 
model (from Shuy, 1978, 1981) where the two visible points, CALP 
and BICS, were clearly distinct, but were joined below the surface in 
what he called "common underlying proficiency" (cf. Cummins, 1984, 
p. 143). 4 

There was a further implication that the two kinds of ability
might be developed in somewhat different contexts and perhaps us- 
ing distinct strategies. Cummins (1983c) quoted David Olson (1977) 
who said: 

...language development is not simply a matter of progressively 
elaborating the oral mother tongue as a means of sharing inten- 
tions. The developmental hypothesis offered here is that the abil- 
ity to assign meaning to the sentence per se [as in a written text], 
independent of its non-linguistic context, is achieved only well 
into the school years (p. 275, cited by Cummins 1983c, p. 116, our 
interpolation). 

What Cummins and Olson apparently intend to emphasize is the 
greater degree of inference required to link up a written text with its 
author's intended meanings than is required in the case of an inter- 
active discourse in the here and now. The latter, presumably the 
typical context of the exercise of BICS, is less cognitively demanding, 
ceteris paribus, than the former, a typical context for the use of 
CALP. 

Within the more elaborate Peircean perspective proposed here, 
Olson's phrase "independent of its nonlinguistic context" might be
reformulated as "without firsthand access to its nonlinguistic
context." This seems to do no violence to Olson's intention, nor to
Cummins's application of the idea in reference to CALP. However, it is
a necessary modification if we accept Peirce's foundational claim that
all interpretation is translation from one form of semiotic
representation to another. This sort of translation is not
viciously circular only because
sensory-motor representations enable the investment of all other 
sorts of representation with material (non-empty) content. 

However, strictly speaking, there is no such thing as a meaning- 
ful "sentence" without a "nonlinguistic" context. With that in mind, 
we assume that Olson and Cummins might accept as a friendly 
amendment to their ideas the interpretation that CALP (or in Olson's 
case, literacy) requires a larger inferential leap from the perceptible
form of a representation (a written text in the case under
consideration) to an appropriate interpretation that associates it
with experiential context. Failing this, it would have to be argued
that a representation which has no inferential relation to any
experiential context whatever is necessarily meaningless. It is
entirely uninterpretable (cf. Einstein, 1944, in Oller, 1989, p. 25,
paragraph 3.13; and Peirce, pp. 99-105 in Oller, 1989).

How then can the CALP/BICS dichotomy be understood within
the proposed hierarchical model? The overlapping part of the iceberg 
beneath the surface would be explained in part as the general factor 
of language proficiency which incorporates whatever aspects of general
intelligence are necessary to that proficiency. For BICS, also, it
is clear that the utilization of both sensory-motor information and 
linguistically coded representations simultaneously would require a 
pragmatic linking that could only be accomplished by access to gen- 
eral semiotic ability. However, with BICS, sensory-motor information 
is immediately accessible to aid the pragmatic linkage. 

In the exercise of CALP, on the other hand, say in reading an 
unillustrated text, e.g., that which appears on this page, any neces- 
sary supplementary sensory-motor representations would have to be 
supplied by the reader. This is a more difficult semiotic task. It re- 
quires a higher degree of inference based on a more abstract semiotic 
system, namely a linguistic one, from which the sensory-motor type 
images must be inferred where they are needed. The move from 
graphological representations to a more abstract linguistic form is 
already a difficult inferential process (reading), and the absence of 
sensory-motor images that might give some clue concerning refer- 
ence, deixis, and the whole pragmatic mapping process involves an- 
other complex of inferences. 

Thus, CALP, with its special emphasis on literacy and abstract 
reasoning would presumably require the development of reading and 
writing skills in the primary or some non-primary language. 
Whereas BICS might benefit indirectly from such a development, lit- 
eracy and specialized abstract reasoning skills, e.g., ability to do 
arithmetic leading on to higher mathematical skills, would not be 
necessary to BICS. To this extent, BICS and CALP are usefully
distinguishable, which suggests an important amplification of
Cummins's threshold hypothesis - one that he has commented on
(Cummins, 1984, p. 117).

The initial distinction between "surface fluency" and "conceptual- 
linguistic knowledge" Cummins attributes to Skutnabb-Kangas and 
Toukomaa (1976). They, no doubt, were influenced by the distinction 
between "surface" structure and "deep" structure from Chomskyan 
linguistics. The idea was that a child might develop quite a lot of rou- 
tine facility with greetings, leave-takings, playground games, and 
the like, and still fall short of the level of language proficiency and 
concept development necessary to reading, writing, and doing arithmetic
(or as Gardner, 1990, terms them, "literacy", "critical thinking",
and "numeracy"). Therefore, a child might appear to do well at
conversation but fail at school (Olson, 1977).

The low threshold for language skill, then, might be construed as 
a completely general requirement applying as much to monolinguals 
as to multilinguals. Presumably this same notion was what another 
generation of specialists in another paradigm meant by "readiness". 
The higher threshold too would have a more general interpretation 
in this context. Presumably "metalinguistic awareness" is merely
another way of referring to what another generation of psychologists
and educators called "learning to learn" or "talking about talk," etc.



Finally, there is also a parallel with the traditional distinction 
between "language disorders" and "learning disabilities" where the 
former have been defined more in terms of surface language prob- 
lems (sometimes even speech difficulties per se) and the latter in 
terms of deeper conceptual difficulties - "neurological" deficits (see 
Coles, 1978; Cummins, 1986) or, more recently, "inefficiencies" 
(Swanson, 1988). Damico (1985b) has argued that traditional tests of
language disorders have tended to focus on surface forms of language
while learning disabilities have been defined, to the extent
they have been defined at all, in terms of deeper conceptual
problems. Again, something like the BICS/CALP distinction appears.
It is a virtue of the proposed model under consideration to be able to 
incorporate such distinctions and to elaborate upon them in intu- 
itively appealing ways. 

Table 1
The Seven Intelligences

Intelligence          End-States               Core Components

Logical-mathematical  Scientist                Sensitivity to, and capacity to discern,
                      Mathematician            logical or numerical patterns; ability
                                               to handle long chains of reasoning.

Linguistic            Poet                     Sensitivity to the sounds, rhythms,
                      Journalist               and meanings of words; sensitivity to
                                               the different functions of language.

Musical               Composer                 Abilities to produce and appreciate
                      Violinist                rhythm, pitch, and timbre;
                                               appreciation of the forms of musical
                                               expressiveness.

Spatial               Navigator                Capacities to perceive the
                      Sculptor                 visual-spatial world accurately and to
                                               perform transformations on one's
                                               initial perceptions.

Bodily-kinesthetic    Dancer                   Abilities to control one's body
                      Athlete                  movements and to handle objects
                                               skillfully.

Interpersonal         Therapist                Capacities to discern and respond
                      Salesman                 appropriately to the moods,
                                               temperaments, motivations, and
                                               desires of other people.

Intrapersonal         Person with detailed,    Access to one's own feelings and the
                      accurate self-knowledge  ability to discriminate among them
                                               and draw upon them to guide
                                               behavior; knowledge of one's own
                                               strengths, weaknesses, desires, and
                                               intelligences.

To see better how the proposed hierarchy works in practice, and
also to show how it can be used in the evaluation of other theories of 
intelligence, it may be useful to pause to examine more closely the 
model proposed by Gardner (1983, 1989, 1990) and colleagues (espe- 
cially, Gardner and Hatch, 1989; Walters and Gardner, 1985, 1986a, 
1986b). Table 1 gives a list of the seven "intelligences" that Gardner 
sees as somewhat independent of each other and yet as capable of 
characterizing the sorts of individual configurations of abilities
that he believes necessary to a more adequate conception of intelli- 
gence. While Gardner and colleagues speak as if their categories of 
"multiple intelligences" were thoroughly independent, they are upon 
examination hardly self-contained, independent modules, but rather 
complex composites of semiotic capacities in each case. Perhaps they 
are quasi-modular in character, but it is difficult to see them even in
that way. Nevertheless, for the sake of demonstrating the intrinsic 
compatibility of the quasi-modular semiotic hierarchy I have been 
discussing here (Figure 4 above especially), I will fit Gardner's cat- 
egories in as shown in Figure 5 and will discuss them one-by-one in 
terms of the analysis given by Gardner and Hatch (1989) as well as 
my own semiotic characterization of their categories. 

The first category is what they call "logical-mathematical intelli- 
gence" which they describe (see Table 1 above) as pertaining to a "sci- 
entist" or "mathematician". It is generally agreed by professional lo- 
gicians and mathematicians (who have gained some awareness of lin- 
guistics) that logic and mathematics are both parasitic and derivative 
fields of study entirely dependent on human language abilities at a 
deep level. Therefore, I have placed Gardner's first "intelligence" as a 
node subordinate to the universal deep language system that is pos- 
tulated to underlie all abstract symbolic systems as well as natural 
languages. 

Gardner's second category, "linguistic intelligence" characterized 
in the special proclivities of a "poet" or "journalist" I have associated 
with primary language ability in the semiotic hierarchy. Gardner 
and Hatch (1989) give no indication that they have in mind any sort 
of polyglot, so I do not relate their category directly to the deeper 
level of universal language ability. That deeper level, I suppose, must 
undergird all abstract symbol systems such as mathematics, logic, 
and musical notation, as well as the abstract symbolic aspects of map 
making, diagramming, illustrating, and in general all forms of what 
Peirce called "abductive reasoning" (or what I term "pragmatic
mapping"; as diagrammed in Figure 3 above).

Gardner's third category, "musical intelligence," as shown in the 
special abilities of a "violinist" or a "composer," I would place under 
the sensory-motor class of representations but with special connec- 
tions to deep language abilities and to kinesic abilities. While a vio- 
linist might not be a reader of musical notation, this is unlikely, and 
a composer certainly would be a reader of music - hence the
connection with the abstract deep language node. In addition, a composer or
a violinist would also be apt to understand the sorts of special ges- 
tural systems used by conductors (though neither of them might be 
conductors, a composer would be likely to have the capacity to con- 
duct one or more musicians in performing his or her music) - hence, 
the connection with the kinesic (significant gestural) node. 

Figure 5
The Semiotic Hierarchy with
Gardner's Seven Categories ("Multiple Intelligences")
Added to the Picture



The fourth kind of intelligence, "spatial," as represented in the 
special skills of a "navigator" or "sculptor" seems remarkably broad. 
Surely it covers a multitude of abilities. Among them would have to 
be found the sensory-motor elements pertaining to perspective and 
movement in time and space as well as a keen sense of proportion 
bordering on the mathematical. For the navigator, mathematical 
skills would surely come into play. For this reason, the "spatial intel- 
ligence" is connected both to the sensory-motor node and to the deep 
language node. 

"Bodily-kinesthetic intelligence," Gardner's fifth kind of intelli- 
gence, as seen in a "dancer" or "athlete" suggests a multitude of con- 
nections as well. If the dancer is a person who understands choreog- 
raphy or if the athlete understands demonstrations of various perfor- 
mances (e.g., how to serve a ball in tennis or how to do a single-leg 
sweep in wrestling), an implicit comprehension of diagrammatic
illustrations would probably come into play. Therefore, I have shown
connections to the kinesic node as well as the sensory-motor, but no 
doubt if coaching comes into the picture, the language node should 
be connected as well. 

The sixth category, "interpersonal intelligence" as seen in a 
"therapist" or "salesman" suggests again an interesting composite of 
abilities. Since "moods, temperaments," etc. (as suggested in the de- 
scriptor of the category) are discerned largely through kinesic and 
paralinguistic systems such as gesture, tone of voice, facial expres- 
sion, and the like, the primary connection would be with the kinesic 
node. However, to the extent that all sales pitches tend to rely on
linguistic as well as other representations, at least the primary lan- 
guage system would come into play. Since Gardner and Hatch give 
no indication that the salesperson or therapist they have in mind is a 
multilingual, connections to languages other than the primary one 
are not shown, but a polyglot would no doubt have them. Therefore, 
it is clear that this module of "intelligence" would probably be heavily 
contaminated by one or more verbal components. 

The seventh category is the most problematic of all. Gardner 
calls it "intrapersonal intelligence" and suggests that it is the ability 
to understand one's own abilities. The sort of person having this par- 
ticular constellation of gifts is not only, we may suppose, a rare bird, 
but one who knows even more about him or herself than the people 
who are looking for him or her. That is to say, a person who under- 
stands his or her own abilities in the way described knows a good 
deal more than the measurement specialists do. This category, how- 
ever, I suppose would have to be linked directly to the deepest level 
of the semiotic hierarchy since it implies knowledge of all the nodes 
beneath it and of their interconnections. This final observation con- 
cerning Gardner's system also sums up my basic objection to it: the 
interconnections that must be posited if we are to understand how 
the various modules relate are missing. The sort of semiotic hierar- 
chy that I am proposing here, however, would supply at least some 
plausible alternatives for such connections. 
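
The connections described in the preceding paragraphs can be summarized as a small lookup structure. The sketch below is only an illustration of the prose: the node labels are shorthand chosen here, the conditional links mentioned in the text (for example, language when coaching is involved, or further languages for a polyglot) are left out, and nothing in it is taken from Figure 5 itself.

```python
# Shorthand node labels for this sketch (assumed names, not fixed by the text).
GENERAL_SEMIOTIC = "general semiotic capacity"
DEEP_LANGUAGE = "deep language"
PRIMARY_LANGUAGE = "primary language"
KINESIC = "kinesic"
SENSORY_MOTOR = "sensory-motor"

# Each of Gardner's categories mapped to the hierarchy nodes the discussion
# explicitly connects it to.
CONNECTIONS = {
    "logical-mathematical": {DEEP_LANGUAGE},
    "linguistic": {PRIMARY_LANGUAGE},
    "musical": {SENSORY_MOTOR, DEEP_LANGUAGE, KINESIC},
    "spatial": {SENSORY_MOTOR, DEEP_LANGUAGE},
    "bodily-kinesthetic": {KINESIC, SENSORY_MOTOR},
    "interpersonal": {KINESIC, PRIMARY_LANGUAGE},
    "intrapersonal": {GENERAL_SEMIOTIC},
}


def categories_linked_to(node):
    """Return the categories that draw on a given node, alphabetically."""
    return sorted(name for name, nodes in CONNECTIONS.items() if node in nodes)


def composites():
    """Return the categories that draw on more than one node."""
    return sorted(name for name, nodes in CONNECTIONS.items() if len(nodes) > 1)


if __name__ == "__main__":
    print("Kinesic node feeds:", categories_linked_to(KINESIC))
    print("Composite categories:", composites())
```

Listing the categories this way makes the objection concrete: several of the supposedly independent "intelligences" share nodes, so they can hardly be treated as self-contained modules.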

One of the most difficult things to see about language proficiency 
is that it may (perhaps must or at least ought to) be conceptualized 
in a considerable variety of different but mutually compatible ways. 
Walters and Gardner (1985) assert that "a particularly high level of 
ability in one Intelligence, say mathematics, does not require a par- 
ticularly high level of ability in another Intelligence, like language or 
music. This independence of Intelligences contrasts sharply with tra- 
ditional measures of IQ that find high correlations among test 
scores" (p. 13). I agree in large measure with what they are saying 
provided we modify the word "independence" to "quasi-independence" 
or something of the sort.

Figure 6
Language Proficiency Viewed as a Composite of
Domains of Grammar

[Diagram: Language Proficiency branches into Pragmatics, Semantics,
Syntax, Lexicon, Morphology, and Phonology.]



With respect to language proficiency per se, it is possible to think 
in terms of the various components of grammar (Figure 6) that con- 
stitute it in theory, or we may think of language proficiency in terms 
of the traditional skills (Figure 7). Or, we may choose any number of 
other angles or combinations of them. What is difficult to see is that 
these are not incompatible ways of viewing the phenomena of inter- 
est - merely different ways. If we focus on primary language ability 
as represented in Figure 4 above, that portion of the diagram might 
be amplified as shown in Figures 6 or 7. In Figure 6, language profi- 
ciency is seen as divisible, more or less, into domains of grammar. 
Pragmatics may be defined as pertaining to those aspects of meaning 
that have to do with actual, particular, concrete contexts of experi- 
ence. Semantics embraces those aspects of meaning that are virtual, 
universal, or abstract. Syntax is concerned with the sequential or si- 
multaneous arrangement of categories of grammar into texts. Lexi- 
con comprises those inventories of elements that are acquired as 
whole units, e.g., words, idioms, pat phrases, verbal routines, and the 
like. Morphology in English is a question of inflections, e.g., plural- 
ization, tense and number marking on verbs, etc., and derivations, 
e.g., adding a morpheme to make a verb of an adjective, e.g., "real" 
plus "-ize" to get "realize," and so forth. Phonology is a matter of de- 
termining the surface forms of phonemes, syllables, lexical items, 
and larger units of structure.

Figure 7
Language Proficiency Viewed as a Composite of
Quasi-Independent Skills

[Diagram: Language Proficiency branches into Listening, Speaking,
Reading, Writing, Signing, and Verbal Thinking.]

Figure 7 shows a similar breakdown with reference to skills such 
as listening, speaking, reading, writing, and verbal thinking. It may 
be argued without risk of contradiction that such hypothetical do- 
mains of structure, or distinct skills, are as valid as the theories upon 
which they are based. However, such divisions can never be finally 
determined any more than Immanuel Kant could determine once for
all the ultimate categories of reason. As Peirce, Einstein, and others
have shown, such categories are intrinsically arbitrary and cannot be
finally fixed or completely determined by any amount of empirical
research (see especially Einstein, 1941, 1944, and Peirce, 1878,
1906). While it may be possible to fix upper and lower limits within 
which the simplicity/complexity of the model must fall, its specifics 
will apparently always retain a substantial arbitrariness nonethe- 
less. 

For instance, there is no conceivable argument that would prove 
either of the componential breakdowns of Figure 6 or 7 to be intrinsi- 
cally superior to the other. For one purpose one model might be pre- 
ferred, for some other purpose, another. What is more, many other 
componential models may be conceived. For example, modes of pro- 
cessing (productive versus receptive) may be distinguished, modali- 
ties of processing (articulatory/auditory versus visual/manual), 
stages of processing (consciousness, short-term, long-term memory), 
etc. In principle, there is an infinite variety of possible componential
models. The answer, therefore, to the advocates of multiple
intelligences (e.g., Gardner, Walters, and other collaborators) is that
there is no single arrangement that will be completely satisfactory. 
Within the proposed hierarchy, this fact can be construed as a natu- 
ral outcome of different ways of combining and/or parsing up various 
of the proposed elements. 

While it was long maintained that cognitive development may be 
hindered by becoming bilingual, the evidence clearly points in the 
other direction (cf. Hakuta and Diaz, 1984; Cummins, 1984, 1986; 
Hakuta, 1986). Dabbling in non-primary language acquisition may
have little or no impact on intellect, but the acquisition of a second or
third or fourth language to a substantial degree of proficiency is apt 
to result in significant, though modest, cognitive gains. In particular, 
the evidence seems to suggest that bilinguals achieve some kinds of 
flexibility in reasoning and a capacity to appreciate certain kinds of 
abstract relations that might remain outside the reach of some 
monolinguals. This result (see the research cited above with reference
to the "threshold" hypothesis) is predicted on the basis of the
hierarchy under consideration.

Moreover, as in the case of the threshold hypothesis, a more gen- 
eral hypothesis is suggested. If bilingualism contributes to mental 
growth only after some threshold is passed, it follows that simply at- 
taining proficiency in one's primary or native language must be im- 
portant to normal mental maturation. Further, if language is a win- 
dow through which researchers may get a fairly clear look at the 
mind, a thesis Chomsky has been pushing lately, it follows that the 
development of language proficiency must be linked to normal cogni- 
tive development. Putting this hypothesis in its most general form 
(Oller, 1991) following Peirce, Einstein, and others, it is possible to
predict that the normal development of deep semiotic abilities must 
depend in subtle ways on the development of the primary language. 
This has been demonstrated above in part by the differentiation of 
iconic, indexical, and symbolic representations. Because of its greater
abstractness (i.e., symbolic character), language has certain capabili- 
ties that the other representational systems lack. Among them is the 
potential for deep level semantic representations that are quite ab- 
stract (i.e., relatively uncontaminated by the two kinds of degeneracy 
associated with icons and indexes). As a result, only deep language 
ability is logically a medium that might serve for the development of 
the most general sort of intelligence. For an elaboration of this idea 
and a content analysis of so-called "non-verbal" IQ tests showing that 
they require such deep propositional or semantic reasoning, see Oller
(1991).

While it may be possible for deep semiotic abilities to be developed
to a high degree with reference to some other manifest form,
say, sensory-motor representations, since linguistic representations 
achieve a more complete level of logical abstractness and conven- 
tional arbitrariness, it seems likely that in normal human beings lan- 
guage development in all of its diversity is the fulcrum on which in- 
tellect attains its greatest leverage. It also follows that language 
abilities will tend toward the center of any definition of human 
exceptionalities ranging from giftedness in all its varieties to disabilities
of all types.

(4) Recommendations for Testing
(and Teaching) LEP Students

Cummins (1986) writes, "Historically, assessment has played the 
role of legitimizing the disabling of minority students. In some cases 
assessment itself may play the primary role, but more often it has 
been used to locate the 'problem' within the minority student..." (p.
29). This process may not have been intentional, but the effect has 
been summed up by Chase (1977) in a single phrase. He called it "the 
biologizing of social problems" (cf. Coles, 1978, for concurrence). 

Not to deny the fact that some children may indeed have genuine 
"neurological" or other "deficits" or even "abnormalities," Cummins 
still contends that the medical "diagnosis/prescription" paradigm has 
seduced a whole generation of educators and clinicians, and that in 
many cases children from minority language backgrounds have been 
ludicrously over-represented in deficit categories (e.g., see Ortiz and 
Yates, 1983). It is the purpose of this section to discuss these facts in 
light of the proposed model of semiotic abilities and to show some of 
the ways that the whole process of assessment might be upgraded 
and set on a path of self-correcting research and progressively 
greater adequacy. 

It is difficult to over-estimate the pervasive influence of analytic, 
discrete-point thinking in the study of exceptionalities. Its main 
manifestation is the search for specific, particular, unique sources of 
difficulty in individual cases. Swanson (1988), for instance, stresses 
the aim of the learning disabilities paradigm to achieve "specificity" 
(p. 197) - a concept that is elaborated throughout his informative ar- 
ticle. This means focussing on "specific mental processes" in instruc- 
tional remediation and determining unambiguously that "the process 
under investigation is responsible for performance" (p. 200). The idea 
of a "generalized deficit," he says, "undermines an important tenet of 
the field" (p. 197). He complains that "there is a lack of theoretical 
integration in the choice of measures in subtyping studies, and non- 
operational definitions of LD exist (Shepard and Smith, 1983). Fur- 
ther," he complains, "there is no agreed upon or satisfactory method 
for determining subtypes (McKinney, 1984)" (p. 197). 

The demand, therefore, appears to be for more specific diagnosis 
and more specific remediation. These goals were characteristic of the 
discrete-point language theory of the 1960s in second and foreign 
language testing. Swanson (1988) shows that this same sort of think- 
ing is current in the study of learning disabilities when he says, 
"Simply stated, a learning disability reflects a cognitive deficit...that 
is reasonably specific to a particular domain (e.g., reading). 3 The spe- 
cific deficits displayed by such children must not extend too far into 
other domains of cognitive functioning. If they did, the concept of a 
learning disability would be meaningless..." (p. 196; his italics). However,
Swanson goes on to observe that in fact "the literature has undermined
the concept of specificity" (p. 197).

If we accept the major premise of Swanson that "the LD field is 
directed by social consensus" (p. 196), then it would follow that "the 
literature" which both establishes and defines the "consensus" could 
perhaps happily be redirected. However, I believe that it is not the 
"literature" per se that has "undermined the concept of specificity" as 
if there had been an active conspiracy against the "social consensus" 
that defines "the field of learning disabilities" (all the quoted terms 
being from Swanson, 1988). The evidence is simply against the idea 
of specificity in the way that it has been put forward. As argued ex- 
tensively above, a more comprehensive and integrated view of 
semiotic capacities is needed to incorporate and explain rather than 
deny or purge the data of existing research. 

A pragmatic approach, along the lines described above, will be required,
and the goal of isolating highly specific elements of cognition
will generally have to be abandoned as a logical mistake. Cognition 
by its very nature involves the differentiation of specific elements 
only in rich and dynamic tensional contexts in which those elements 
find their distinctive identities. Apart from such contexts, those spe- 
cific elements do not exist. This has been the primary motivation for 
clinical discourse analysis (Damico, 1985a, 1985b), an approach 
which seeks to understand the actual dynamics of the communicative 
performances of children rather than to pigeon-hole them into ready- 
made categories that may turn out to be altogether inappropriate in 
many cases. Discrete elements of cognitive processing only attain the 
character that really defines them in the contexts of their dynamic 
tensional oppositions in relation to each other and the whole continuum
of experience (see the voluminous writings of Peirce on this
matter as represented in collections by Burks, 1958; Hartshorne and
Weiss, 1931-1935; Fisch, et al., 1982; Moore, et al., 1984; and Oller,
1989).

What about the current consensus that defines and purports to 
identify children with language disorders and/or learning disabili- 
ties? While the latter category has come more by tradition than by 
evidence to be associated with "neurological impairment", the idea 
that the former category is a subset of the latter is merely a matter of 
definition. The distinction between the larger category, learning dis- 
abilities, and the subcategory, language disorders (cf. Rueda and 
Mercer, 1985; also Cummins, 1986, p. 29), is merely assumed to be 
generally valid. 4 The distinction is never demonstrated by factual 
evidence anywhere in the vast literature on learning disabilities. In 
addition to a critical examination of this distinction, therefore, I wonder
about the social consensus that sustains (Swanson, 1988) the
whole field of special education and the study of exceptionalities in
general. 

As soon as the National Advisory Committee on Handicapped 
Children (1968) launched the first sentence of its long-standing defi- 
nition of "learning disabilities" the confounding of that term with 
"language proficiency" and therefore with "language disorders" 
should have been abundantly apparent. From there forward, the 
problem of providing a theoretically adequate basis for the sought-after
distinctions only becomes more confused. They wrote:

Children with learning disabilities exhibit a disorder in one or 
more of the basic psychological processes involved in understand- 
ing or using spoken or written languages. These may be mani- 
fested in disorders of listening, thinking, talking, reading, writ- 
ing, spelling, or arithmetic. They include conditions which have 
been referred to as perceptual handicaps, brain injury, minimal 
brain dysfunction, dyslexia, developmental aphasia, etc. They do 
not include learning problems which are primarily due to visual, 
hearing, or motor handicaps, to mental retardation, emotional 
disturbance, or to environmental disadvantage (p. 4). 

What is remarkable is that a vast number of workers could be 
encouraged to entertain the illusion that the kind of thinking ex- 
pressed by the NACHC (and similar bodies) was a sufficient founda- 
tion on which to erect the present superstructure of the vast and 
growing edifice of special education. 

Coles (1978) reviewed ten of the most widely used procedures for 
identifying children with the sorts of "disabilities/disorders" suppos- 
edly defined in the previous paragraph. He examined the Illinois 
Test of Psycholinguistic Abilities, Bender Visual-Motor Gestalt Test, 
Frostig Developmental Test of Visual Perception, Wepman Auditory 
Discrimination Test, Lincoln-Oseretsky Motor Development Scale, 
Graham-Kendall Memory for Designs Test, Purdue Perceptual Motor 
Survey, Wechsler Intelligence Scale for Children — Revised, neuro- 
logical evaluations, and electro-encephalograms. These were found to 
be the most common procedures in use for the identification and di- 
agnosis of learning disabilities in most states. 

The sad conclusion was that "the predominant finding in the lit- 
erature suggests that each test fails to correlate with a diagnosis of 
learning disabilities" (p. 326). Neither was there evidence of correct 
diagnosis in the results of therapeutic interventions: "In experiments 
where the dysfunction itself was treated, there was little success" (p. 
326). While correlation alone is never proof of a causal relation, the 
absence of correlation is fatal to theories about specific causal connections.
At the end of his article, Coles asserted, somewhat optimistically
it would seem in retrospect, that "there is little question that
eventually the tests reviewed here will be discarded; the evidence
against them is mounting" (p. 335). If we think in terms of centuries 
rather than decades, this statement may yet turn out to be correct. 
At the moment, the tests in question are probably being used in 
about as many states and in far more cases in 1991 than they were in 
1978. 

When it comes to the subset of learning disabilities known as lan- 
guage disorders, there is even more confusion, if that is possible. The 
deep underlying question is what do tests used to define language 
disorders (and learning disabilities) really measure? The theory is 
that they should measure something over and above whatever intel- 
ligence tests measure. According to most researchers they are sup- 
posed to identify actual "neurological impairments" or at least "neu- 
rological inefficiencies" (Swanson, 1988). 

However, if we take a paradigm exemplary test such as the Illinois
Test of Psycholinguistic Abilities (Kirk, McCarthy, and Kirk,
1968), it turns out to be notably ineffective in predicting even reading
scores if we control for IQ. Newcomer and Hammill (1975) reported
that the correlation between ITPA scores and reading scores
evaporated when intelligence was used as a covariate. Our point
here is not to defend IQ tests as such (on the contrary, see part B below),
but to show how confounded the constructs of language disorders,
learning disabilities, and IQ are with each other. Moreover, we
are arguing that all of these constructs have tended to overlook what 
is probably the single most important mediating variable, namely, 
primary language proficiency. 
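
What it means for a correlation to "evaporate" under a covariate can be illustrated with a first-order partial correlation. The sketch below is an assumption-laden illustration, not a reconstruction of Newcomer and Hammill's analysis: the text does not say which procedure they used, the data are synthetic, and the variable names are hypothetical. Two simulated scores are built so that each depends on a common third score and on nothing else.

```python
import math
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys with the linear effect of zs removed."""
    r_xy, r_xz, r_yz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Synthetic data (assumption): both scores are the shared factor plus
# independent noise, so they are related only through that factor.
rng = random.Random(0)
n = 5000
iq = [rng.gauss(0, 1) for _ in range(n)]
itpa = [z + rng.gauss(0, 1) for z in iq]
reading = [z + rng.gauss(0, 1) for z in iq]

zero_order = pearson(itpa, reading)
controlled = partial_corr(itpa, reading, iq)

if __name__ == "__main__":
    print(f"zero-order r(ITPA, reading) = {zero_order:.2f}")
    print(f"partial r controlling for IQ = {controlled:.2f}")
```

In data built this way the zero-order correlation is substantial while the partial correlation is close to zero, which is the pattern the reported finding describes: the test adds little predictive information beyond what the covariate already carries.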

In general there has been a consensual distinction between 
"mental retardation" and "minimal brain damage" or "neurological 
impairment." Mental retardation is supposed to be related to, among 
other things, scores below some arbitrarily established level on stan- 
dardized IQ scales. We, like Cummins (1984, see note 9, p. 30), do not 
deny that brain damage occurs in some cases or that mental retarda- 
tion is in some instances a useful designation. What we do question, 
on the other hand, is whether these categories can be and are ad- 
equately distinguished on the basis of the present approach to IQ 
measurement and learning disabilities diagnosis (also see Mercer, 
1973; Briere, 1973). There is substantial evidence that the distinction 
is thoroughly confounded in large numbers of cases. For instance, 
children identified as having "learning disabilities" in many cases are 
well below average in IQ scores. Out of 3,000 "learning disabled" chil- 
dren (identified as such in twenty-one states), more than a third fell 
below 90 on the standard IQ scale (Kirk and Elkins, 1975). 

Why would educators tend to place at least some "mentally re- 
tarded" cases in the "learning disabled" category? It is clear that the 
former category is more stigmatized than the latter, and that the
compassionate diagnostician, psychologist, or whatever, will prefer
the less damaging label. But the problem surely runs much deeper 
than this. Beers and Beers (1980) point out that in some school sys- 
tems a fourth to a third of the total school kindergarten population is 
being flagged as "potentially" learning disabled. This seems odd 
when a dramatically smaller percentage of the population is apt to 
have either genetic or acquired physical disabilities. Cummins (1984) 
aptly describes the category of "learning disabled," therefore, as "a 
dumping ground for a wide variety of learning and behavioral diffi- 
culties" (see also Hallahan and Cruickshank, 1973). Swanson (1988) 
confesses that there is not a single trait, nor even a cluster of them, 
that can be identified as common to the category. 

Undoubtedly it was because of the profound degree of confusion 
about the relation between mental retardation and learning disabili- 
ties that the American Association on Mental Deficiency arbitrarily 
changed the definition of "mentally retarded" from one to two stan- 
dard deviations below the mean on a standardized IQ scale 
(McKnight, 1982). Cummins (1984, p. 83) sees this change as moti- 
vated by the desire to reclassify large numbers of formerly "mentally 
retarded" children as "learning disabled." A question that immedi- 
ately arises is what such a change means in reference to the underly- 
ing constructs of intelligence versus neurological impairments. Be- 
yond this, there is the lingering question of how language proficiency 
may be construed as relating to either of these constructs. What is 
disturbing is that in their educational applications both constructs 
are becoming, it would seem, increasingly folkloric and arbitrary. 
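
The practical effect of moving the cutoff from one to two standard
deviations can be made concrete with a short calculation. The sketch
below assumes, purely for illustration, that IQ scores are normally
distributed with a mean of 100 and a standard deviation of 15. Neither
figure is stated above, and real score distributions only approximate
the normal curve.

```python
from statistics import NormalDist

# Assumed scale (not given in the text): mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

def share_below(sd_below_mean: float) -> float:
    """Proportion of the population scoring below the given cutoff."""
    cutoff = iq.mean - sd_below_mean * iq.stdev
    return iq.cdf(cutoff)

old_share = share_below(1)  # cutoff at 85 on the assumed scale
new_share = share_below(2)  # cutoff at 70 on the assumed scale

print(f"one SD below the mean: {old_share:.1%}")
print(f"two SD below the mean: {new_share:.1%}")
print(f"no longer in category: {old_share - new_share:.1%}")
```

Under these assumptions roughly 16 percent of a population falls below
the old cutoff and about 2 percent below the new one, so the
redefinition by itself moves something like one child in seven out of
the category without any change in the children themselves.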

Traditionally the identification of children with "language disor- 
ders" or "communicative disorders" or the general run-of-the-mill 
class of "learning disabilities" has been based on fairly superficial, 
surface-oriented criteria. For example, traditional diagnosticians 
have asked whether or not a child appropriately uses plural nouns 
(e.g., "dogs" versus "dog"), possessives (e.g., "Jim's hat" versus "Jim 
hat"), third person singular non-past verbs (e.g., "he walks" versus 
"he walk"), past tense verbs (e.g., "wanted" versus "want"), noun-verb 
agreement (e.g., "I am" versus "I be" or "I is"), irregular verbs (e.g., 
"fell" versus "falled"), number concord (e.g., "these cats" versus "this 
cats" or "these cat"), auxiliaries (e.g., "they have gone" versus "they 
gone," "they be gone," or "they done gone"). With respect to phonol- 
ogy, clinicians have tended to emphasize such things as the various 
forms of the regular plural morpheme in English (viz., /z/, /s/, or 
/əz/) and the similar variations that occur in possessive marking of 
nouns, the third person singular non-past marking of verbs, the con- 
tractions of "is" and "has," and the similar variations that occur in 
marking of regular past-tense verbs (viz., /d/, /t/, or /əd/). 

Of course, surface form has some significance in its own right, 
but it has been elevated in the traditional tests, measurements, and 
diagnostic procedures of speech-language pathologists to such a posi- 
tion of prominence that the deeper purposes, the pragmatic aims of 
communication have been overlooked. As a result, "language disor- 
ders" have typically been defined in terms of superficial elements of 
syntax, morphology, and phonology, and more often than not have 
been strictly limited to problems of speech and writing rather than 
deeper aspects of the production and comprehension of meaningful 
discourse. Not only has the diagnostic definition of "language disor- 
ders" qua "learning disabilities" been based on surface-oriented crite- 
ria traditionally, but the treatment of them has likewise focussed on 
"intensive instruction in phonics" and "perceptual training" (cf. Beers 
and Beers, 1980, p. 73). The remedies, like the diagnoses, have been 
largely ineffective (Coles, 1978). 

When attention is turned to discourse processing and to prag- 
matic criteria that have the potential at least of tapping into the 
deeper conceptual processes that underlie it, it is expected that 
genuine communicative difficulties, the kind that are apt to influ- 
ence academic achievement in dramatic ways, are more apt to be 
turned up (Damico and Oller, 1980; Damico, Oller, and Storey, 1983; 
Damico, 1985a, 1985b; Damico and Oller, 1986; McCord and Haynes, 
1988; see note 5). This is not to say that researchers are 
presently in a position to determine on the basis of any existing test- 
ing program the specific neurological correlates of a given perfor- 
mance. This may be possible in rare cases but is certainly not the 
norm. Rather, as Coles (1978) intimated, there are no fully developed 
"less well-known instruments standing in the wings" (p. 335) and 
ready to fill the present void of thoroughly validated diagnostic pro- 
cedures. As Coles said, "These tests, in any case, do not yet exist" (p. 
335), and even the theory for their development is largely lacking. 

What chiefly stands in the way of the needed theoretical and 
practical development is the uncritical acceptance of the present "so- 
cial consensus." If researchers and practitioners alike are willing to 
acquiesce to the status quo of existing categories such as "language 
disorders," "learning disabilities," "mental retardation," and in gen- 
eral to the whole "diagnosis/remediation" paradigm, the needed re- 
form of theory and practice is bound to be delayed if it ever comes at 
all. As Cazden (1985) has argued, the labeling of minority children 
especially as "disabled" or "disordered" must be, in her words, 
"delegitimized" and this can only be accomplished by looking to the 
broader context of socialization and education as has been argued by 
Coles (1978), Cummins (1984, 1986), and by Oller and Perkins 
(1978). 

Based on all of the foregoing a few heuristic guidelines may be 
offered. Since the damage is likely only in cases of disabilities rather 
than giftedness, we concentrate on the former. To begin with there 
are logically just four types of errors to be avoided: (1) a LEP child 
may be wrongly identified as disabled; (2) a truly disabled LEP child 
may be left out of the disabled category; (3) a LEP child may be 
incorrectly classed as a non-LEP; or (4) a non-LEP child may be 
classed as a LEP. 
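
The four error types amount to mismatches between what is true of a
child and what the placement decision says. The sketch below encodes
them in that form. The `Case` record and its field names are
hypothetical, and the "true" values stand for a ground truth that in
practice is never directly available; it is exactly what assessment
tries to estimate.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Case:
    # Hypothetical ground truth about the child.
    true_lep: bool
    true_disabled: bool
    # What the placement decision recorded.
    assigned_lep: bool
    assigned_disabled: bool

def error_types(case: Case) -> List[int]:
    """Return the numbered error types (1 to 4) that apply to one placement."""
    errors = []
    if case.true_lep and case.assigned_disabled and not case.true_disabled:
        errors.append(1)  # LEP child wrongly identified as disabled
    if case.true_lep and case.true_disabled and not case.assigned_disabled:
        errors.append(2)  # truly disabled LEP child left out of the category
    if case.true_lep and not case.assigned_lep:
        errors.append(3)  # LEP child classed as non-LEP
    if not case.true_lep and case.assigned_lep:
        errors.append(4)  # non-LEP child classed as LEP
    return errors

# A LEP child with no disability who is nevertheless labeled disabled.
print(error_types(Case(True, False, True, True)))
```

Written this way, it is plain that the types are not mutually
exclusive: a single placement can, for example, commit errors (2) and
(3) at once.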

It is known that large numbers of errors of type (1) are occurring. 
Many LEPs are incorrectly being diagnosed as disabled, or otherwise 
retarded. It follows from the same studies documenting type (1) er- 
rors that type (2), disabled LEPs not being identified as such, must 
also be common. Error type (3), LEPs incorrectly classed as non- 
LEPs, seems most likely when in Cummins' terms a child has devel- 
oped substantial BICS in English but not much CALP. In these cases 
educators are apt to be fooled into thinking the child is ready for lit- 
eracy in English when the child is still below threshold even in his or 
her primary language. Error type (4), non-LEPs classed as LEPs, can 
also occur if the child is evaluated on the basis of limited BICS while 
well-developed CALP in the child's primary language may be over- 
looked. Misclassifications of all four types are likely to become more 
common as the number of non-English speaking minorities in our 
schools increases (see note 6). 

To minimize errors of all four types a series of assessment phases 
is recommended. In all phases, the pursuit of evidence concerning 
the child should be treated in a matter-of-fact manner and with a 
view to the advocacy of the interests, needs, and feelings of the child 
above those of the school or the diagnostician. Our purpose as educa- 
tors should be to promote and guard the interests of the child, not 
those of some abstract political or educational entity such as a state, 
institution, profession, or psychological yardstick (Cazden, 1975; 
Coles, 1978; Cummins, 1986). 

First, to distinguish LEPs from non-LEPs, a variety of sources of 
evidence should be considered, e.g., talk with the child, observe the 
child's behavior in casual contexts, talk with siblings, parents, 
friends, etc., where appropriate. Ask about literacy and previous edu- 
cational experience. Keep in mind that superficial, routine verbal 
skills may be deceptive in two ways: (a) they may lead us to attribute 
more language ability than is really present, or (b) they may seem to in- 
dicate a low level of academic readiness when in fact the child is al- 
ready literate in one or more other languages. Clear-cut cases may be 
decided on the basis of this preliminary phase to be either LEP or 
non-LEP. Doubtful cases should be referred to the second phase of 
assessment. 

Two kinds of doubtful cases may be distinguished. Children with 
substantial educational background, e.g., those who have attained 
literacy in one or more other languages, but who lack basic routine 
skills (BICS) in English constitute the first case. These children 
should be evaluated with reference to their attainment in their most 
developed or primary language(s). For instance, some Asians will 
prove to be weak in English but literate in French and possibly some 
other language. To determine this fact may require additional inter- 
views and possibly testing in the primary language. The question to 
be addressed in these cases is presumably, would it best serve the 
interests of this child if he or she were mainstreamed? If Cummins 
(1984 and elsewhere) is correct in the threshold hypothesis, only chil- 
dren who have demonstrated fairly advanced literacy skills or other 
abstract linguistic capabilities should be mainstreamed. 

The other kind of doubtful cases referred from phase one would 
include the children who appear to have substantial ability to per- 
form routine tasks in English (BICS) but who may or may not be 
ready for academic mainstreaming. The determination here, as in all 
cases, should be based on the solution that is believed most likely to 
benefit the child optimally. Preferences on the part of the child 
and/or the child's parents should be weighed together with further evi- 
dence concerning academic readiness. The latter should be evaluated 
mainly in terms of the child's ability to perform abstract reasoning in 
the primary language and/or in English. Again, if Cummins (1984) is 
on the right track and if the theory as discussed above is followed in 
a general way, well-developed abstract reasoning capacities in one 
language will easily transfer to another assuming that there are no 
affective or social barriers (see note 7) actively interfering with the process. In 
short, presumably some of these children should be mainstreamed, 
and some should not. 

Phase three concerns children who have been identified as LEPs 
needing some kind of special program to enable them to profit opti- 
mally from their on-going educational experience. The objective dur- 
ing this phase is to differentiate children who are ready for a normal 
course of instruction in their primary language and those who may 
need some extra help beyond this. The latter are those traditionally 
labeled "learning disabled." 

At this point, teachers or competent para-professionals who know 
the primary language(s) of the children should have already been in- 
volved and now become the main assessors. They should be trained 
in the deeper kinds of language assessment procedures that look to 
discourse/text-based tasks that include the broad range of communi- 
cative activities that school children are becoming able to engage in: 
e.g., relating an experience, singing a song, reading and reacting to a 
story, drawing a picture to illustrate some idea, explaining an illus- 
tration, evaluating a facial expression or gesture in a filmed narra- 
tive, play or drama, writing a letter, answering an advertisement, 
etc. The list of tested activities should be as broad as the curriculum 
children are expected to cope with. As suggested by Damico, Oller, 
and Storey (1983) and elaborated by Damico and Oller (1985) as well 
as Damico (1985a, 1985b, 1991), LEP students should be assessed in 
all of their languages and in each case across the broad spectrum of 
abilities so as to identify strengths. The objective at all points along 
the way should be not to look merely at surface forms but to look 
more deeply into the pragmatic aspects of discourse processing. 

If there is even the slightest clue that the child is bilingual or 
multilingual every effort must be made to test the child in his or her 
strongest language(s). Some probing on this point may be necessary 
since it may not occur to the child, or to his parents, to tell some 
teacher or diagnostician, "By the way, I can read and write in Man- 
darin." They may not see this fact as relevant in an English speaking 
society or school. It may, however, be of considerable importance to 
an appropriate assessment of the child's actual capabilities. If a "dis- 
ability" is suspected, where children are thoroughly bilingual or even 
multilingual, it is mandatory to assess their abilities in each of the 
languages they know. Usually this will involve only English and one 
other language, but in exceptional cases three or even more lan- 
guages might be involved. To make a convincing case for a "learning 
disability," it is necessary to show that problems appearing in one of 
the child's languages also appear in the other. 

There is no theory of language acquisition that will support the 
thesis that "learning disabilities" will only be manifested in French, 
or any other particular language. Deep semiotic processing problems, 
the kind that affect language capacity in a general way, or possibly 
other semiotic representational processes as well, are bound to mani- 
fest themselves in a variety of ways and cannot logically be limited to 
just one of a multilingual's languages. On the other hand, if problems 
are just apparent in one of two or more language systems a child pos- 
sesses, it follows that the difficulties are likely to be within the nor- 
mal range experienced by second language learners and that no real 
"learning disability" exists at all. 
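
The reasoning of the last two paragraphs reduces to a simple rule of
evidence, sketched below. The function name, the input format, and the
wording of the returned labels are all hypothetical. The single
true/false flag per language is a deliberate simplification of what
would in practice be a rich discourse-based assessment.

```python
from typing import Dict

def interpret_difficulties(problems_by_language: Dict[str, bool]) -> str:
    """Apply the cross-language rule to one child's assessment results.

    Keys are the languages the child knows; a value of True means that
    discourse-level problems were observed in that language.
    """
    if len(problems_by_language) < 2:
        return "rule not applicable: fewer than two languages assessed"
    if all(problems_by_language.values()):
        return "problems in every language: refer for fuller analysis"
    if any(problems_by_language.values()):
        return "problems in one language only: within normal second language learning"
    return "no difficulties observed"

print(interpret_difficulties({"English": True, "Mandarin": False}))
```

The point of the sketch is the asymmetry: evidence from one language
alone can rule a "learning disability" out, but it can never by itself
rule one in.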

Phase four, for children identified as having special semiotic 
problems in more than one language or other semiotic modality, 
would involve a complete discourse analysis along the lines of 
Damico (1985a, 1985b, 1991) leading into recommendations for 
therapeutic intervention of an appropriate sort. At this point assess- 
ment merges with instruction (alias therapy) so completely that the 
two can no longer be profitably distinguished. 

It would seem that procedures for intervention could benefit as 
much from an investigation of language instructional methods that 
work (cf. Oller and Richard-Amato, 1983; and Richard-Amato, 1988) 
as assessment of abilities and disabilities of LEPs could from the 
findings of language testing research. More particularly, pragmati- 
cally motivated procedures that deal with problems in the full rich- 
ness and scope of normal experience will have a far better chance of 
success than discrete-point oriented procedures that are generally 
acknowledged to be recipes for failure (see Coles, 1978). 

Here are a few heuristic guidelines for assessment in general. 
Samples of discourse, or assessment procedures themselves, should 
always involve performances in engaging contexts of semiotic repre- 
sentation. Wherever possible a variety of sources of evidence should 
be examined, e.g., multiple languages, dialects, kinesic representa- 
tions, and sensory-motor performances. The objective should always 
be to find the child's optimal capabilities, not to define some set of dis- 
abilities. Judgments should never be considered final but should be 
subject to constant updating, revision, and rechecking. No single test 
should form the basis for assessment, much less for any final judg- 
ment. In the final analysis our goal is to set the child 
up for success, not for failure. 

Notes 

1 Interestingly, Olson (1986) goes even further than Oller (1981). 
Subsequently, however, I believe we have followed the same river of 
thought (see Oller, 1989; Olson, 1986; Langer, 1987; and Sternberg, 
1987). 

2 According to an unpublished study reported on at this meeting by 
Dr. Sherry R. Migdail, as few as 50 out of 1,000 students in a typi- 
cal middle America school district were observed to have some form 
of genuine special education need (e.g., mental retardation, lan- 
guage-disorder/learning-disability, etc.). Yet, as Dr. Alba Ortiz ob- 
served in her presentation at this conference, a far higher percent- 
age of students are misidentified as needing special education. 

3 Of course, the implication that a term like "reading" (or "listening" 
or even "spelling," all of which occur elsewhere in Swanson's paper) 
can be construed as "specific" is absurd on its face. Reading is as 
complex as any process known to modern science. Neither is it dis- 
tinguishable except in superficial ways from all that accompanies it 
- reasoning, arguing, imagining, etc. To suggest that such a pro- 
cess achieves the sought after "specificity" is to reveal the shallow- 
ness of thinking that characterizes the whole "social consensus" 
that constitutes "the field of LD". 

4 Cummins (1986) cites Rueda and Mercer (1985) who claimed that 
the distinction between "learning disabled" and "language disor- 
dered" for minority children is typically a matter of whether there 
is a "psychologist" or a "speech-pathologist" on the placement com- 
mittee. Cummins concludes that the distinction is "essentially arbi- 
trary" (1986, p. 29). 

5 It should be noted that the latter authors, according to their own 
bibliography, only had access to summarial presentations of the 
pragmatic criteria they attempted to employ. Also, they compared 
only 12 "learning disabled" children as determined by the criteria 
set by the State of Alabama with 12 normals defined as such in 
view of their performance at "expected academic grade level". The 
authors apparently assume, without justification, that the children 
identified by the state's criteria really are "learning disabled," but 
this is precisely the premise that needs to be questioned. Unless 
independent evidence of "learning disability" exists in those 12 chil- 
dren, evidence that would be missing for the "normals" against 
whom they are to be compared, the pragmatic criteria for evalua- 
tion cannot be tested with the experimental design that was in fact 
employed. In the final analysis, only some of the pragmatic criteria 
proposed by Damico and company did discriminate between the 
"disabled" and "normal" groups. However, this may be as much a 
consequence of group selection as of the criteria. Besides, it has 
been argued that significant difficulties can be expected for children 
that depart substantially from the norm on any one of the prag- 
matic criteria under consideration. 

6 Note that we do not use the term "disabled" here to legitimize it, 
nor do we agree that children in general to whom the label is at- 
tached are as it describes them. Our point here is to enable all chil- 
dren, LEPs and non-LEPs, normal and exceptional, to have access 
to the full range of educational benefits to which they are legiti- 
mately entitled. 

7 Krashen (1981, 1982, 1985) has argued that affective resistance to 
normal second language acquisition may occur in high anxiety or 
otherwise disturbing contexts. Assuming he is correct in this, every 
effort should be made to avoid the kinds of social conditions that 
might constitute or at least augment the mounting of such barriers. 

Response to John Oller's Presentation 



Fred Davidson 
University of Illinois, Urbana/Champaign 

Well, half of me wants to say sort of this is really easy because I 
agree. I do agree very deeply. The title of my talk is, "From the 
Trenches." In this paper for this meeting, John Oller has presented 
a thorough, theoretical and philosophical basis for motivated, pro- 
active change in the assessment of language minority students in the 
United States. In my reaction, I shall do two things. I'm going to 
briefly summarize and interpret his main points, give you a glimpse 
of the rest of that 66-page document, and then attempt to relate his 
philosophical stance to the pragmatic necessities of language minor- 
ity students. Now, it is this second section that has caused me to 
title my paper, "From the Trenches." Oller's work is broad reaching 
and provocative. From my background as a language tester, who has 
worked with small and large language assessment data sets, I have 
decided to challenge myself and discuss how his proposal might be 
implemented in the front line trench battles of language testing in 
school settings. 

First the summary: Oller's paper has three parts. In the first 
part, he reviews primary and non-primary language testing litera- 
ture. He discusses the heritage which language testing shares with 
intelligence measures as well as the historical link of language test- 
ing with structural linguistics. These two trends are primarily re- 
sponsible for the prevalence of discrete-point testing approaches, and 
the question John raises or implies many times is - is it appropriate 
to consider language ability as the sum of many parts? By the sec- 
ond section of the paper, Oller's beliefs are clear, when on page 20, 
he says, "Happily a movement toward pragmatic, holistic testing is 
now discernable." 

Much of the first section of his paper seems directed at this con- 
clusion - a conclusion which I share very deeply. Language ability is 
indeed a complex mental trait and holistic integrative testing should 
hold forth more than it does. Oller says near the end of his paper, "It 
is difficult to overestimate the pervasive influence of analytic, dis- 
crete-point thinking in the study of exceptionalities." And it is pre- 
cisely this pervasive influence that I've taken as my mandate: how to 
expand the framework of language assessment measures in the real- 
ity of school based decision making, and I am going to return to this 
later. 

Second, Oller has a review of relevant points from the recent his- 
tory of educational measurement. He cites Roid and Haladyna, 
"There is a chance for endless mapping sentences, facts, and facet 
elements with lack of agreement among developers being a major 
deterrent to progress." And then he goes on to say, "when the focus 
is shifted from a list of items, (which is a poor characterization in any 
case of any non-finite domain of sentences) to the generative basis 
which underlies the representations that constitute that domain, we 
have some hope of achieving reliability and validity." The multi- 
facet nature of criterion referenced measurement or, strictly speaking, 
domain referencing, is anathema to good language testing, Oller 
seems to say. I agree generally with this, but I suggest that criteria 
can also be holistic, and I've done some work in the design and imple- 
mentation of criterion referenced test specifications that are pragmatic 
and holistic. The other major component of this section is citation of 
the work of Gardner on multiple intelligences as that is central to 
part three. I want to deal with it in my discussion of that part. 

In part three, Oller sketches his own model of human systems of 
representation, and there are three diagrams in there ending at the 
one that integrates Gardner with Oller. He calls this his own gen- 
eral semiotic capacity model. This is by far the most philosophically 
challenging section of the paper. My impression is that Oller is ex- 
panding on the notion of a general factor of language to encompass 
multiple components, and that he is utilizing Gardner's work to do 
so. Oller now views language as a global factor that contains compo- 
nents. This is very clear in the paper and this is very welcome. He 
closes with a series of assessment recommendations for teaching and 
testing language minority students. Much of this discussion centers 
around the nature of disabilities. He claims that the handling of lan- 
guage minority students has been heavily conditioned by the history 
of measurement, of language disorders, and/or learning disabilities. 
I have seen this, first hand, in my work with K through 12 ESL and 
bilingual data in the state of Illinois. I agree heartily. 

He has several recommendations at the very end, including the 
use of "pragmatically motivated procedures" that deal with the prob- 
lems in the full richness and scope of normal experience as well as a 
call for multiple measures about which I will speak specifically be- 
low. He seems to regret the difficulty of implementing change in lan- 
guage minority student education. The ease with which a disability 
or remediation paradigm can rule the day prompts him to say, "What 
chiefly stands in the way of the needed theoretical and practical de- 
velopment is the uncritical acceptance of the present social consen- 
sus, if researchers and practitioners alike are willing to acquiesce to 
the status quo of existing categories like language disorders, learning 
disabilities, mental retardation, and in general to the whole diagno- 
sis and remediation paradigm the needed reform of theory and prac- 
tice is bound to be delayed if it ever comes at all." He is challenging 
our field then to find a way to break the uncritical acceptance of the 
status quo, and I'd like to take up that challenge in part in the next 
section. 

So, a voice from the trenches. 



Now, the issue here, it seems to me, is that we need to get inside 
the heads of the people that matter. All assessment is done within the 
context of decision making. There is a real good paper by Jack 
Upshur from 1970 and Lyle Bachman extends it in his 1990 text- 
book. The person making the decision may or may not be a test de- 
signer and, if so, may or may not subscribe to the philosophical shifts 
which Oller promotes, and with which I heartily agree, so this begs 
the question, why? What causes the acquiescence that bothers John 
Oller and bothers me? Let me offer a practical, real world answer. I 
believe that we need to legitimize the change necessary for the as- 
sessment of language minority students. This legitimization requires 
two components: first, foolproof argument and, second, logistical 
ease, i.e., that the new must be as easy to implement as the old. 
First, foolproof argument should affect assessment score users on 
philosophically strong grounds, as Oller has done, as well as through 
elegant simplicity, and I'd like to offer an example of the latter, draw- 
ing heavily upon an excellent paper in Language Testing by Mats 
Oscarson, 1989. I highly recommend it. Oscarson argues that if 
modern language teaching is more focused on the learner then the 
learner should be consulted in the assessment process. He argues, 
therefore, that language testing should include self-report. At the 
very beginning of his paper, Oscarson notes that there are funda- 
mentally two types of assessment, external and internal. The 
former, external, is imposed from outside of the learner. Most tests 
are actually external. The latter is self-report of some sort or an- 
other and reflects the internal goals, agenda, and motivations of the 
learner, goals which may or may not match the external tests. 
Oscarson's paper closes with samples of self-report in language 
testing, and the particular appropriateness of those samples to K 
through 12 is not really relevant here. What is at issue here is the 
undeniable simplicity of Oscarson's argument. The differentiation of 
assessment into self and non-self in my eye is equal to the philosophi- 
cal paradigm shift that separated criterion referencing from norm 
referencing. Hudson and Lynch (Language Testing, 1984) and Glaser 
(1963), whom they cite, note the following about the difference be- 
tween norms and criteria. They note that if achievement happens in 
the classroom then a normalizing test will actually unskew a curve. 
All teachers after they teach want people to achieve. Apply a nor- 
malizing norm referenced test to that, and you will actually convert 
it back to a bell curve. Effectively, the achievement will be statisti- 
cally squashed. That's a powerful argument which appeals to teach- 
ers everywhere. I maintain that Mats Oscarson's argument, that 
testing needs to be internal and external, is equally simple and pow- 
erful. 
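
The statistical point about norms can be illustrated with a small
sketch. The scores below are invented, and rank-based normalization
onto a scale with mean 50 and standard deviation 10 is only one
assumed way of forcing scores onto a bell curve; norm referenced tests
in practice reach a similar result through item selection and scaling
rather than through this exact transformation.

```python
from statistics import NormalDist, mean, median

# Hypothetical end-of-course scores (percent correct) for 12 students.
# Most have mastered the material, so the scores pile up near the top.
raw = [55, 78, 88, 90, 92, 93, 94, 95, 96, 97, 98, 99]

def normalize(scores, mu=50.0, sigma=10.0):
    """Rank-based normalization onto an assumed scale (mean 50, SD 10).

    Ties are not handled specially; the example data contain none.
    """
    target = NormalDist(mu, sigma)
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    out = [0.0] * len(scores)
    for rank, i in enumerate(order):
        # Mid-rank percentile keeps the argument strictly between 0 and 1.
        out[i] = target.inv_cdf((rank + 0.5) / len(scores))
    return out

normed = normalize(raw)

print("raw:    mean", round(mean(raw), 1), "median", median(raw))
print("normed: mean", round(mean(normed), 1), "median", round(median(normed), 1))
```

In the raw scores the mean sits below the median because most students
cluster near the ceiling, which is what successful teaching looks
like. After normalization the scores are symmetric about the middle of
the scale, and the evidence that nearly everyone achieved is no longer
visible in the distribution.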



Several years ago, I was fortunate to be in a seminar with John 
Oller at UCLA. There I presented a case for something I then called 
and still call "multiple referencing" which is a super-ordinate term to 
link criterion referencing and norm referencing and other references, 
as yet to be determined. I believe that self-report is actually a form 
of test reference, call it self-referencing, on equal stature to that of 
criteria and norms. Furthermore, I believe the simple elegance of 
Oscarson's argument elevates self-referencing to the status of norms 
and criteria. The simple, elegant, undeniable elevation of the new to 
the status of the old is one crucial component to breaking the acqui- 
escence which Oller condemns. In this particular instance of pro- 
posed change, I believe multiple referencing is not really a new con- 
cept, just a new term. Oller even appeals for it at the very end of his 
paper, as have many others who have used terms like multiple crite- 
ria and multiple indicators, and I have a whole scad of references 
here on that. I maintain that terms like multiple indicators and mul- 
tiple criteria help us see multiple sources of evidence within a certain 
score reference, but why not attack the number of references as well, 
and that's why I proposed self-referencing. 

But this isn't enough. An argument in favor of expansion of the 
number of score references, which in essence, John does at the end of 
his paper, is not the only necessity by far. We need to make the 
change work. I often pose the following question to my language 
testing students. Two situations. Situation A: You are an adminis- 
trator at a school, a decision maker. You have 900 new international 
students arrive at your school, and you must decide their English 
proficiency. You consult a single norm referenced discrete-point test 
score. Situation B: You are the same person. You have 900 new in- 
ternational students arrive at your school, and you must decide their 
English proficiency. You consult a single norm referenced test score, 
a single criterion referenced test score, and you interview each stu- 
dent for self-report. The issue is that the entire technological history 
of logistical ease and human measurement is intertwined with the 
summative discrete-point test score. We cannot get away from what 
we appear to do so well. Clinical, detached, quasi objective discrete- 
point norm referenced testing, we have that down pat. A couple of 
years ago at a conference, I met Edward DeAvila, a developer of the 
LAS assessment battery. He showed me a computer expert system 
program to help a decision maker navigate multiple information 
sources, some of which constituted multiple references. As I recall, 
he had both norms and criteria, and I think that this program was, or 
was a refinement of, one developed for the Chicago Public School Sys- 
tem. Now, I'm not proposing, necessarily, that a computerized ex- 
pert system can automate the navigation of multiple references and a 
broader range of what John calls pragmatically motivated proce- 
dures, but I do claim that unless we do something to break the logis- 
tical strangle-hold of norm referenced summative discrete-point 
tests, we are doomed to fail. Let's hit them with both barrels. Let's 
use the elegant simple arguments, and let's make routine the com- 
plexity of dealing with language testing as it should be dealt with. 

In closing, I would like to echo the sentiment of Anne Frank, "I 
do believe that people are basically good at heart." ...despite the way 
this sounds. I do agree with Anne Frank. People are basically good 
at heart, and this includes the staunchest decision maker/addicts of 
norm referenced test scores. I believe, rather, that what happens is 
not that they consciously reject the persuasion of Oscarson, Oller, 
and others, but rather that such change is felt to be logistically im- 
possible. Let's work on that feeling. 

Response to John Oller's Presentation 



Myriam Met 
Montgomery County Public Schools, Maryland 

I agree with both John Oller and Fred Davidson. I'm just a 
simple practitioner, so what I'd like to do with you this morning is 
try to extrapolate some of the implications from foreign language 
practice, from the paper that John has written, and from the re- 
marks Fred has shared with you this morning. I'd like to talk briefly 
about the notion of global testing of proficiency and tie that to what I 
think is a more important and valuable trend for all of us, which is 
classroom assessment of language skills. 

The first part, global proficiency, I think, draws from the buzz 
word in the foreign language profession today (and it has been for 
the last decade), which is "proficiency." You might find this defini- 
tion interesting because it's a somewhat different view of the term 
"proficiency" from the one that I was familiar with when I worked in 
ESL and bilingual programs (about six years ago). In foreign lan- 
guage proficiency, one is never "proficient." One is only proficient to 
perform certain tasks or language functions, in certain contexts or 
settings about certain topics or contents, and with a degree of both 
linguistic and socio-cultural accuracy. To some extent, all of us are 
limited proficient in that none of us, even the most ideal (but non- 
existent) educated native speaker, is ever completely proficient to per- 
form all language tasks, in all contexts, in all contents, with the 
same degree of linguistic and socio-cultural accuracy. That is an im- 
portant concept to which I'll come back in a little while when I talk 
about classroom proficiency and some definitions. 

In the 1980s, the American Council on the Teaching of Foreign Lan- 
guages (ACTFL) undertook, along with the Educational Testing Ser- 
vice, to develop a global proficiency measure, which was called, not 
surprisingly, the ACTFL/ETS oral proficiency rating scales. What's 
interesting about the scales, for those of you who are not foreign lan- 
guage professionals, is the fact that, probably for the first time in 
the history of foreign language teaching in this century, there exists 
a common metric for the assessment of secondary and post-secondary 
students, a standardized instrument that allows everyone to agree on 
what the terms mean. The term "proficient," then, never really 
meant proficient to do everything all the time, everywhere, in every 
way possible, but simply to perform certain tasks in certain settings 
at a certain degree of accuracy as defined by the scales. That doesn't 
mean that everyone agrees that the scales themselves are perfect. 
There's no general consensus that this is the only reliable measure. 
In fact, there's a great deal of debate raging over the content validity 
of the proficiency scales. But one of the points, which is important 
for non-foreign language professionals to note, is that this is one in- 
strument that everybody can focus their attention on and begin to 
talk about as a way of looking at student performance. 



I bring that up because in a previous life, one which I enjoyed a 
great deal and miss a great deal, I worked with ESL and bilingual 
programs. One of the greatest frustrations was the lack of appropri- 
ate instruments to find out what children knew and were able to do. 
In proficiency testing, one is always focusing on what the learner can 
do, under what circumstances, and how well. Whereas, when I 
worked in ESL and bilingual education, I was never quite sure what 
the tests were really supposed to be testing. One advantage that 
those who work in the assessment of language minority children 
should have over foreign language professionals is in the area of 
identifying goals and objectives. The purpose for assessing English 
language skills should be to find out if students have acquired the 
English skills necessary for successful academic performance at or 
above grade level. In contrast, in foreign language, we very rarely 
know what students are going to be able to do with their language 
skills. We don't know the purposes to which they will put their lan- 
guage skills. It's really hard to figure out how to find out what chil- 
dren know when you really don't know what you expect them to 
know and be able to do in the first place. And if you really don't 
know what you want them to do, how do you know what to teach 
them? If you don't know what to teach them, it's awfully hard to de- 
cide how to test them. That should not be the case when we work 
with language minority students, because we are very clear about 
what we want them to be able to do. We want them to be able to suc- 
ceed in school. John Oller has said it very well: "Language is the 
key to successful endeavors, especially, in the school setting." If we 
know what kids are supposed to be able to do, then why aren't we 
finding out if they can do it? 

It seems to me that entry and exit decisions were based on the 
wrong things when I worked in ESL/Bilingual education. If you 
want to know whether a student can perform well academically, the 
first thing you probably ought to know is: "What are the demands of 
the academic curriculum from the language perspective?" At every 
grade level and in every content domain, that may differ; therefore, a 
student at the third grade level may need to understand this much 
English, speak this much English, read that much English, write 
that much English. It might be different for a fourth grader learning 
social studies or a second grader learning science. Yet, the tests we 
were working with all looked at students' oral production. Some of 
them are very discrete point, such as whether a student could dis- 
criminate between the sounds of yellow and jello. Except when we 
teach the concept "matter changes form," we never use the term jello 
in the third grade. Yet, discriminating yellow from jello was a ques- 
tion on the test, and whether you got to stay in the program de- 
pended on whether you understood the difference. That kind of 
decontextualized assessment of language seemed to be irrelevant to 
what we needed the kids to be able to do. 

As an ESL program director, I was always terrified when the 
children took their language proficiency tests. Part of me really 
wanted them to do well, because that was what our program was all 
about — helping them to succeed. We wanted every child to do as well 
as possible. But this little voice inside of me said, "Oh, if they do 
well, we won't be able to help them anymore." Because no matter 
what the tests said, I knew that some of those children weren't quite 
ready to make it on their own. It seemed to me an awfully silly way 
to make a decision about who gets in, who stays in, and who gets out. 

All of that brings me to my central argument. 

The most promising way, then, to address the concern of the ap- 
propriate assessment of language proficiency is through 
instructionally-based assessments, such as the ones we have been 
hearing about at this symposium and certainly the ones I think are 
relevant from my experience with foreign language immersion pro- 
grams. In foreign language immersion programs, students learn 
content through a language in which they have limited skills. Im- 
mersion teachers are responsible for ensuring that their students 
achieve the objectives of the school curriculum while gaining skills in 
a new language. In this respect, their roles and responsibilities par- 
allel those of teachers who work with language minority students. 

For the last four and a half years, I have been involved in a 
project to identify the training needs of foreign language immersion 
teachers and to help develop training materials to meet those needs. 
I would be the first to say, and I really want to stress this, that for- 
eign language immersion is not the same as ESL or bilingual educa- 
tion (nor should it be), but the needs of the teachers who work in 
these fields are similar in that they're all engaged in teaching con- 
tent in a language that is new to their students. Also, some of you 
who are working in the field of developmental bilingual education 
may find it interesting to hear some of the training issues that are 
involved in foreign language immersion. We have been helping 
teachers learn how to teach in these foreign language settings and to 
find out, indeed, if children are learning. In this project, I have come 
to believe that the teaching of language and content should be in- 
separable. I am going to say that again, because I think that is the 
most important thing I have to say today. The teaching of language 
and content ought to be inseparable. Language is learned best through 
a context and a content, particularly when the aim of the language 
program is to enable students to be successful academically in their 
new language. John Oller has just told us that language is impor- 
tant to all educational endeavors, and that to separate language from 
meaning, language from thought and cognition, and from content, is 
to make a mockery of the business that we're all about. 

Language objectives and content objectives must be tied to one 
another. Both sets of objectives must be considered when planning 
for teaching and when planning for testing. We tell our teachers 
that planning for testing takes place at the time that you plan for 
your teaching. Teachers must identify the language demands of the 
curriculum and plan to include means for students to gain in lan- 
guage as they grow in concept attainment. Anne Snow, Fred 
Genesee, and I have suggested elsewhere a model for the integration 
of language objectives with the teaching of content, and vice versa, 
and have demonstrated how the roles of the ESL teacher, the bilingual 
teacher, the mainstream teacher, and the foreign language teacher 
are fulfilled within that framework. I'm not going to go into that pa- 
per here, but I do want to stress the importance of teaching language 
through content and the importance of considering every content les- 
son a language lesson as well. 

Teaching and testing go hand in hand. As John points out in 
his paper (a point he didn't mention this morning), testing activities 
should be as broad as our teaching activities. In fact, planning for 
testing and planning for teaching need to be done at the same time. 
Effective foreign language immersion teachers begin to plan by first 
thinking about what they want students to learn. Then, when they 
know what they want students to learn, they've got to figure out how 
they're going to know it when they see it. If they know what they 
want children to learn and how they're going to find out if they have 
learned, then the next step also falls in line, which is, how you're go- 
ing to get children ready to show you or to perform their knowledge. 
Those are the enabling activities; that's the teaching part. So, learn- 
ing and teaching and testing all belong together. Good immersion 
teachers, then, are able to ensure that their objectives, their teach- 
ing, and their testing all fit together, because they see them as inex- 
tricably tied to one another. And, they define their objectives, teach- 
ing, and testing both in terms of content and language. 

Since teaching concepts in a new language often requires that 
immersion teachers use visual and other concrete experiences during 
instruction, it follows that similar approaches are appropriate when 
testing students. Students should have access to materials that help 
them show the teacher what they know, even when they can't always 
tell her. 

In assessing students, immersion teachers are most concerned 
with finding out what students have learned and allowing students 
to demonstrate what they have learned. The emphasis is on what 
students can do and do know, not on what they don't know and can't 
do. John told us in his paper that what we need most in our profes- 
sion are integrative tests that tie teaching to learning to assessment. 
In foreign language immersion, classroom-based language assess- 
ments that are conducted as part of the instructional delivery system 
serve a number of important masters. First, they seem to be the 
most appropriate way of finding out whether students have the lan- 
guage skills needed for academic performance, precisely because the 
assessment ties language to its purpose, which is content learning. 
These assessments are authentic in that they measure student profi- 
ciency in the real contexts in which language use occurs. They're in- 
tegrated and assess the range of skills needed in the classroom for 
successful academic performance. These tests, in essence, have con- 
tent validity, because, (as I heard the term used yesterday), they 
"test the right thing." Classroom-based performance assessments 
put the focus where it belongs, on student growth. Performance as- 
sessments such as portfolios, systematic observation, and teacher 
evaluations of student products and projects are effective ways to 
find out about student progress in relation to the objectives we've set 
for them. Because they're based on student performance, they show 
us what students can do and do know, and because they compare each stu- 
dent to his or her last performance, they only compare students to 
themselves, not to some idealized and probably non-existent average 
student or native speaker. 

Last, they're the most appropriate way of ensuring that the deliv- 
ery of instruction is commensurate with the linguistic proficiency of 
the student at that point in time and in that content domain. 

From the day to day instructional perspective, the marriage of 
language assessment with content assessment helps teachers, 
whether they're foreign language immersion teachers or those who 
teach language minority students, engage in a constant formative 
diagnostic feedback loop. In our training of foreign language immer- 
sion teachers, we emphasize the importance of surveying students' 
background knowledge prior to introducing a new concept. Every 
teacher does that but, for foreign language immersion teachers, this 
also means that they must know the range of the students' linguistic 
ability to handle the concepts. The teacher needs to know the lan- 
guage demands of the curriculum objectives and the extent to which 
special strategies, manipulatives, and concrete materials will be nec- 
essary for instructional delivery. 

Immersion teachers are content teachers, but they're also lan- 
guage teachers. We believe that every content lesson should be a 
language lesson as well, and that foreign language immersion teach- 
ers need to plan as conscientiously for language growth as they do 
for content. In part, planning for language growth means the 
teacher must be continuously assessing where students are in rela- 
tionship to where they ought to be and using that assessment data to 
identify areas where further development of language growth is 
needed. 

It's clear, then, that as instruction progresses and as teachers ob- 
serve the growth of students, a great deal of assessment data can be 
collected about the achievement of both content and language objec- 
tives. These data provide important information about each indi- 
vidual student but, in the aggregate, data from systematic observa- 
tions, checklists, portfolios, and teacher-made tests also provide in- 
formation about the effectiveness of the instructional program. 

In conclusion, trends in foreign language teaching and testing 
have two major implications for the assessment of language minority 
children. One is, perhaps, a different definition of proficient - a 
recognition that all language users, both native and non-native, 
are differentially proficient to perform language tasks in different 
settings and at varying levels of performance. None of us is com- 
pletely proficient, and this definition of proficiency renders the no- 
tion of limited proficiency almost meaningless, as a system of catego- 
rizing learners. Language minority students bring with them a rich 
resource in their home language and culture. The label, "limited pro- 
ficiency students," as John tells us in his paper, only perpetuates a 
deficit model of instruction and relegates ESL and bilingual educa- 
tion to a compensatory role. Perhaps a more useful way of looking at 
proficiency, as in the newer definition in foreign language, is to de- 
scribe what learners can do, under what circumstances, and how 
well. For language minority students, defining proficiency in terms 
of classroom language - the tasks, the functions, the contexts, and 
the contents in which they must perform - will allow us to focus as- 
sessment measures where they belong, on academic performance. 
The second implication of foreign language immersion is that the 
teaching and testing of English in ESL and bilingual programs must be 
integrated with the content students are to learn. If the teaching 
and testing of English were more intimately tied to the learning of 
content, we might more effectively integrate teaching, learning, and 
assessment. To paraphrase the late Ron Edmonds (and I'm sure 
many of you have heard this before), "All children can learn. All chil- 
dren must learn, and all teachers must learn to teach (...and I'll 
throw in my paraphrase, and equitably assess) all children." 